Arrow.write() cannot handle large Mmap-ed table #413

Open
Moelf opened this issue Apr 3, 2023 · 0 comments


Premise:

  1. the input file is a large, uncompressed Arrow file
  2. we build a mask and take a view() over the Mmap-ed table
  3. we use Arrow.write() to write the filtered table to disk
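
For illustration only, the three steps above can be sketched on a small stand-in file. The temporary paths, the column names `x` and `y`, and the mask condition are assumptions made for this sketch and are not part of the original report, which uses ./nanoAOD_nocomp.feather and a row range:

```julia
using Arrow, DataFrames

# Stand-in for the large uncompressed input (hypothetical data and paths).
inpath = joinpath(mktempdir(), "in.feather")
Arrow.write(inpath, (x = collect(1:100), y = rand(100)))

# 1. Open the file; Arrow.Table memory-maps it, and copycols=false
#    keeps the DataFrame columns backed by that mapping.
df = DataFrame(Arrow.Table(inpath); copycols=false)

# 2. Build a Boolean mask (hypothetical condition) and take a view.
mask = df.x .> 50
filtered = @view df[mask, :]

# 3. Write the filtered view back to disk.
outpath = joinpath(dirname(inpath), "out.feather")
Arrow.write(outpath, filtered)
```

At this size the write is cheap; the problem reported here only shows up once the input and the selected rows are large.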

Memory usage seems to grow with the number of rows the mask selects.

I know this doesn't work correctly because, if I set a memory limit first, a large enough view fails with an OutOfMemoryError:

> ulimit -Sv 8000000
julia> using Arrow, DataFrames

julia> const df = @time DataFrame(Arrow.Table("./nanoAOD_nocomp.feather"); copycols=false);
  2.685720 seconds (4.84 M allocations: 321.109 MiB, 4.41% gc time, 100.89% compilation time)

julia> Arrow.write("/home/akako/Downloads/out.feather", @view df[1:1*10^4, :]);

julia> Arrow.write("/home/akako/Downloads/out.feather", @view df[1:2*10^4, :]);
ERROR: Internal error: encountered unexpected error in runtime:
OutOfMemoryError()
unknown function (ip: 0x7f9d7329fc99)
unknown function (ip: 0x7f9d732935b5)
jl_gc_alloc at /home/akako/Documents/github/dotFiles/homedir/.julia/juliaup/julia-1.9.0-rc1+0.x64.linux.gnu/bin/../lib/julia/libjulia-internal.so.1 (unknown line)
ijl_alloc_array_1d at /home/akako/Documents/github/dotFiles/homedir/.julia/juliaup/julia-1.9.0-rc1+0.x64.linux.gnu/bin/../lib/julia/libjulia-internal.so.1 (unknown line)
unknown function (ip: 0x7f9d5ec6259e)
unknown function (ip: 0x7f9d5e36c9dc)
unknown function (ip: 0x7f9d5e73ee1f)
unknown function (ip: 0x7f9d5e73ed98)