# Element-wise mathematical operators and iterator slides

Posted by Chris G

I recently took part in a fairly elaborate discussion on the julia-stats mailing list about mathematical operators for `DataFrames` in Julia. Although I still do not agree with all of the arguments that were made (at least not yet), I once again came away with a very comforting feeling about the lively and engaged Julia community. Even one of the most active and busiest community members, John Myles White, took the time to explain his point of view in detail, and this might just be the greater good to me. Different opinions will always be part of any community. But it is the transparency of the discussions that tells you how strong a community is.

Still, mathematical operators are important to me, as I quite frequently work with strictly real numeric data: no `Strings`, and no columns of categorical IDs. Given Julia’s expressive language, it would be quite easy to implement any desired mathematical operators for `DataFrames` on my own. However, I decided to follow what seems to be the consensus of the `DataFrame` developers, and hence refrain from any individual deviations in this direction. Instead, I decided to simply relate any element-wise operators of multi-column `DataFrames` to `DataArray` arithmetic, which allows most mathematical operators for individual columns. Viewed from this perspective, element-wise `DataFrame` operators are nothing more than operators that are successively applied to the individual columns of a `DataFrame`, which are `DataArrays`.
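To make the column-by-column view concrete, here is a minimal sketch in plain Base Julia (current 1.x syntax). It deliberately avoids the `DataFrames` and `DataArrays` APIs: the "table" is just an ordered list of `name => column` pairs, and `columnwise` is a hypothetical helper name of my own, not something either package provides.

```julia
# Stand-in for a two-column table: an ordered list of name => column pairs.
# This is a plain Base Julia structure, not the DataFrames API.
table_a = [:x => [1.0, 2.0, 3.0], :y => [10.0, 20.0, 30.0]]
table_b = [:x => [0.5, 0.5, 0.5], :y => [1.0, 2.0, 3.0]]

# Hypothetical helper: apply a binary operator element-wise, one column at a time.
# Each Pair is destructured into its name and its column vector.
function columnwise(op, a, b)
    return [name => op.(ca, cb) for ((name, ca), (_, cb)) in zip(a, b)]
end

summed = columnwise(+, table_a, table_b)
# summed[1] is :x => [1.5, 2.5, 3.5]
# summed[2] is :y => [11.0, 22.0, 33.0]
```

The point is only that the table-level operator never does arithmetic itself; it delegates to whatever element-wise arithmetic the column type already supports.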

As a consequence, I had to deepen my understanding of iterators, comprehensions and functions like `vcat`, `map` and `reduce`. For future reference, I summed up my insights in a slide deck, which anybody who is interested can find here, or as part of my IJulia notebook collection here.
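As a small illustration of how these pieces fit together (not an excerpt from the slides), the following sketch uses only Base Julia on a plain vector of columns. The variable names are made up for this example.

```julia
# Three "columns" of equal length, stored as a vector of vectors.
columns = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Comprehension: compute one value per column.
means = [sum(c) / length(c) for c in columns]       # [1.5, 3.5, 5.5]

# map: the same computation, expressed as a function applied to each column.
means_map = map(c -> sum(c) / length(c), columns)   # [1.5, 3.5, 5.5]

# reduce with vcat: stack all columns into one long vector.
stacked = reduce(vcat, columns)                     # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# reduce with +: add the columns to each other, element by element.
total = reduce(+, columns)                          # [9.0, 12.0]
```

A comprehension and `map` are largely interchangeable here; `reduce` is the tool to reach for when the per-column results need to be combined into a single object.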

For those of you who are using the TimeData package, the current road-map regarding mathematical operators is the following: any types that are constrained to numeric values only (including the extension to `NA` values) will continue to provide mathematical operators. These operators perform some minimal checks upfront, in order to reduce the risk of meaningless applications (for example, only adding up columns with equal names, equal dates, …). For any type that allows values other than numeric data, these mathematical operators will not be defined. Hence, anybody in need of element-wise arithmetic for numeric data can easily make use of either the `Timematr` or the `Timenum` type (even if you do not need any time index). If you do, however, make sure not to mix up real numeric data and categorical data: applying mathematical operators or statistical functions like `mean` to something like customer IDs will most likely lead to meaningless results.
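The kind of upfront check described above could look roughly like the sketch below. This is not TimeData's actual implementation: `checked_add` is a hypothetical function, the table is again a plain list of `name => column` pairs, and only the column-name check is shown (a date check would follow the same pattern).

```julia
# Hypothetical helper: add two tables column by column,
# but only if their column names agree exactly and in order.
function checked_add(a, b)
    names_a = [first(p) for p in a]
    names_b = [first(p) for p in b]
    if names_a != names_b
        throw(ArgumentError("column names differ: $names_a vs $names_b"))
    end
    return [name => ca .+ cb for ((name, ca), (_, cb)) in zip(a, b)]
end

# Made-up example data with matching column names.
returns_a = [:stock1 => [1.0, 2.0], :stock2 => [3.0, 4.0]]
returns_b = [:stock1 => [10.0, 20.0], :stock2 => [30.0, 40.0]]

result = checked_add(returns_a, returns_b)
# result[1] is :stock1 => [11.0, 22.0]
# result[2] is :stock2 => [33.0, 44.0]
```

Calling `checked_add` with tables whose column names differ throws an `ArgumentError` instead of silently adding unrelated columns.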

Posted on 2014/08/06, in Julia and tagged iterators, map, slides. 2 Comments.

Thanks for these slides, lots of very helpful stuff compiled in one place.

One thing that might be nice to know is that you can spread a tuple into multiple named variables inside a comprehension, which can make for slightly clearer access to the contents of a column in a DataFrame. So you can do something like

`[col for (name, col) in eachcol(df)]`

instead of

`[col[2] for col in eachcol(df)]`

I didn’t know that yet – thanks a lot for pointing it out!