avoid allocation when negating BitArray #2497

Merged on Oct 23, 2020 (1 commit)

@OkonSamuel (Contributor) commented Oct 22, 2020

Avoid an allocation when negating the `BitArray` in the `dropmissing!` method.
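For readers unfamiliar with the idea, here is a minimal sketch of the difference between negating a `BitVector` into a new array and negating it in place. It is not the actual diff from this PR: `negate_copy` and `negate_inplace!` are hypothetical helper names used only for illustration, and the sketch assumes the mask is a temporary that can safely be overwritten.

```julia
# Hypothetical helpers for illustration; these are not DataFrames.jl functions.

# Allocating version: `.!mask` builds a brand-new BitVector for the result.
negate_copy(mask::BitVector) = .!mask

# In-place version: the fused broadcast assignment writes the negated bits
# back into `mask`, so no second BitVector is allocated.
function negate_inplace!(mask::BitVector)
    mask .= .!mask
    return mask
end

mask = BitVector([true, false, true, true])

flipped = negate_copy(mask)    # new array; `mask` is left unchanged
same = negate_inplace!(mask)   # `mask` itself now holds the negated bits

@assert same === mask          # the very same object was reused
@assert mask == flipped        # both approaches give the same bits
```

The in-place form is only appropriate when the original mask is no longer needed, which is typically the case for an intermediate row mask that is computed and then immediately consumed.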

## this pr
julia> using BenchmarkTools, DataFrames, Random

julia> Random.seed!(0);

julia> rv1 = rand(Bool, 10000);

julia> rv2 = rand(Bool, 10000);

julia> mv1 = [rv1[i] ? missing : i for i in eachindex(rv1)];

julia> mv2 = [rv2[i] ? missing : Float64(i) for i in eachindex(rv2)];

julia> df1 = DataFrame(i = 1:10000, x=mv1, y=mv2);

julia> df2 = DataFrame(i = 1:10000, x=mv1, y=mv2);

julia> df3 = DataFrame(i = 1:10000, x=mv1, y=mv2);

julia> @btime dropmissing!($df1, :x);
  4.697 μs (18 allocations: 5.28 KiB)

julia> @btime dropmissing!($df2, :y);
  5.310 μs (18 allocations: 5.28 KiB)

julia> xy = [:x, :y];

julia> @btime dropmissing!($df3, $xy);
  18.728 μs (46 allocations: 11.34 KiB)

## master
julia> using BenchmarkTools, DataFrames, Random

julia> Random.seed!(0);

julia> rv1 = rand(Bool, 10000);

julia> rv2 = rand(Bool, 10000);

julia> mv1 = [rv1[i] ? missing : i for i in eachindex(rv1)];

julia> mv2 = [rv2[i] ? missing : Float64(i) for i in eachindex(rv2)];

julia> df1 = DataFrame(i = 1:10000, x=mv1, y=mv2);

julia> df2 = DataFrame(i = 1:10000, x=mv1, y=mv2);

julia> df3 = DataFrame(i = 1:10000, x=mv1, y=mv2);

julia> @btime dropmissing!($df1, :x);
  5.281 μs (21 allocations: 6.06 KiB)

julia> @btime dropmissing!($df2, :y);
  5.766 μs (21 allocations: 6.06 KiB)

julia> xy = [:x, :y];

julia> @btime dropmissing!($df3, $xy);
  19.032 μs (48 allocations: 11.77 KiB)

avoid allocation when negating BitArray in `dropmissing!` method
@bkamins (Member) commented Oct 22, 2020

Looks good. Can you submit here some performance benchmarks for comparison before we merge? Thank you!

@OkonSamuel (Contributor, Author)

> Looks good. Can you submit here some performance benchmarks for comparison before we merge? Thank you!

Sure.

@OkonSamuel (Contributor, Author)

@bkamins done

@bkamins merged commit 40c368d into JuliaData:master on Oct 23, 2020
@bkamins (Member) commented Oct 23, 2020

Thank you!
