allow [:a,:b,:c,:d] => fun => new_column_name syntax #2256

Closed
floswald opened this issue May 14, 2020 · 2 comments

@floswald
Contributor

I was trying to do this:

julia> using DataFrames

julia> df = DataFrame(a=1:3, b=4:6, c=rand(3))

# expected
julia> select(df, [:a,:b] => (a,b) -> a./b)
3×1 DataFrame
│ Row │ a_b_function │
│     │ Float64      │
├─────┼──────────────┤
│ 1   │ 0.25         │
│ 2   │ 0.4          │
│ 3   │ 0.5          │

# not expected!
julia> select(df, [:a,:b] => (a,b) -> a./b => :d)
3×1 DataFrame
│ Row │ a_b_function         │
│     │ Pair…                │
├─────┼──────────────────────┤
│ 1   │ [0.25, 0.4, 0.5]=>:d │
│ 2   │ [0.25, 0.4, 0.5]=>:d │
│ 3   │ [0.25, 0.4, 0.5]=>:d │

# not expected either
julia> select(df, [:a, :b] => (a,b) -> a ./ b => "new")
3×1 DataFrame
│ Row │ a_b_function            │
│     │ Pair…                   │
├─────┼─────────────────────────┤
│ 1   │ [0.25, 0.4, 0.5]=>"new" │
│ 2   │ [0.25, 0.4, 0.5]=>"new" │
│ 3   │ [0.25, 0.4, 0.5]=>"new" │

Given that you allow the source => fun => new_column_name syntax when transforming a single column, why not also allow it for multiple columns? The problem with the current behaviour is that if I want to add two transformed columns that get the same auto-generated name, I can't do that in one call. Or what's the best approach here?

julia> select(df, [:a, :b] => (a,b) -> a ./ b, [:a, :b] => (a,b) -> a .+ b)
ERROR: ArgumentError: duplicate target column name a_b_function passed

@bkamins
Member

bkamins commented May 14, 2020

This is a consequence of how Julia itself parses the expression, not a problem with DataFrames.jl.

Note the difference:

julia> x -> x => :new
#5 (generic function with 1 method)

julia> (x -> x) => :new
var"#7#8"() => :new
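
The same difference can be checked in plain Julia, without DataFrames.jl. This is only a small sketch: the names f, p and spec are made up for illustration and are not part of any API.

```julia
# Without parentheses, everything to the right of `->` is the function body,
# so this is a function that returns a Pair.
f = (a, b) -> a ./ b => :d
r = f([1, 2], [4, 5])
println(f isa Function)   # f is a function, not a Pair
println(r isa Pair)       # calling it yields `result => :d`

# With parentheses, `=>` builds a Pair of (function, target name).
p = ((a, b) -> a ./ b) => :d
println(p isa Pair)
println(p.second)         # the target name

# `=>` is right-associative, so the full specification nests as
# columns => (function => name).
spec = [:a, :b] => ((a, b) -> a ./ b) => :d
println(spec.first)       # the source columns
println(spec.second isa Pair)
```
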

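So, as you can see, you were bitten by operator precedence: the body of an anonymous function extends as far to the right as possible, so the => :d ends up inside the function body. You have to wrap the anonymous function in parentheses, like this: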

julia> select(df, [:a,:b] => ((a,b) -> a./b) => :d)
3×1 DataFrame
│ Row │ d       │
│     │ Float64 │
├─────┼─────────┤
│ 1   │ 0.25    │
│ 2   │ 0.4     │
│ 3   │ 0.5     │

julia> select(df, [:a, :b] => ((a,b) -> a ./ b) => "new")
3×1 DataFrame
│ Row │ new     │
│     │ Float64 │
├─────┼─────────┤
│ 1   │ 0.25    │
│ 2   │ 0.4     │
│ 3   │ 0.5     │
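
With the parentheses in place, the duplicate-name error from the original question should also go away, because each transformation can be given its own target name. A minimal sketch, assuming a DataFrames.jl version that supports the source => fun => target form shown above; the column names :ratio and :total are chosen here only for illustration:

```julia
using DataFrames

df = DataFrame(a = 1:3, b = 4:6)

# Two transformations of the same source columns in one call.
# Each anonymous function is wrapped in parentheses and given an explicit
# target name, so no two outputs share the auto-generated a_b_function name.
out = select(df,
    [:a, :b] => ((a, b) -> a ./ b) => :ratio,
    [:a, :b] => ((a, b) -> a .+ b) => :total)

println(out)
```
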

@bkamins closed this as completed May 14, 2020
@floswald
Contributor Author

cool! thanks!
