Describe the bug
In arrow-rs/arrow-buffer/src/builder/null.rs, line 57 (at db81108), the assertion check is len < capacity, where len is the number of boolean null/non-null bit values and capacity is buffer.len() * 8, with buffer.len() the size in bytes of the buffer containing those bits. If len is a multiple of 8, say 8b, then the buffer used to store it has length (buffer.len()) b, so len == capacity. This valid situation fails the assertion check.
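The arithmetic can be illustrated with a small standalone sketch. It does not use the arrow crates: `bytes_for_bits`, `strict_check` and `relaxed_check` are hypothetical helpers written for this illustration, and the sketch assumes the buffer is sized tightly to the number of bits, as described above. The relaxed comparison is shown only for contrast, not as a statement of how the library should be changed.

```rust
/// Number of bytes needed to hold `len` validity bits (one bit per value).
/// Hypothetical helper, assuming a tightly sized buffer.
fn bytes_for_bits(len: usize) -> usize {
    (len + 7) / 8
}

/// Stand-in for the check described in this report: `len < capacity`.
fn strict_check(len: usize, buffer_len: usize) -> bool {
    len < buffer_len * 8
}

/// The same check with `<=`, shown for comparison only.
fn relaxed_check(len: usize, buffer_len: usize) -> bool {
    len <= buffer_len * 8
}

fn main() {
    for len in [7usize, 8, 9, 16] {
        let buffer_len = bytes_for_bits(len);
        println!(
            "len={} buffer_len={} strict={} relaxed={}",
            len,
            buffer_len,
            strict_check(len, buffer_len),
            relaxed_check(len, buffer_len)
        );
    }
}
```

With a tightly sized buffer, the strict comparison rejects exactly the lengths that are multiples of 8, because those fill the last byte completely.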
To Reproduce
// Assume pa: PrimitiveArray holding values including nulls, with length a multiple of 8
let b = pa.into_builder().expect("into_builder");
thread 'main' panicked at .../.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-buffer-50.0.0/src/builder/null.rs:57:9:
assertion failed: len < capacity
stack backtrace:
...
2: core::panicking::panic
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:127:5
3: arrow_buffer::builder::null::NullBufferBuilder::new_from_buffer
4: arrow_array::builder::primitive_builder::PrimitiveBuilder::new_from_buffer::{{closure}}
at .../.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-50.0.0/src/builder/primitive_builder.rs:164:27
5: core::option::Option::map
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/option.rs:1072:29
6: arrow_array::builder::primitive_builder::PrimitiveBuilder::new_from_buffer
at .../.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-50.0.0/src/builder/primitive_builder.rs:163:35
7: arrow_array::array::primitive_array::PrimitiveArray::into_builder
at .../.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-50.0.0/src/array/primitive_array.rs:946:46
Expected behavior
into_builder() succeeds.
Additional context
In my use case, I am parsing a text file containing number values (or some value indicating "null") and storing these in a PrimitiveBuilder, for later conversion to an Array and further processing. I do not know in advance what number type a series of values has: they might all be integers, but it could also be that the first few look like integers while later ones are floats. So I start out with e.g. a PrimitiveBuilder and parse the next string value into an int. If it parses ok, I add the value; if the parse fails, I try to parse it as e.g. an f32. If that succeeds, then this series of values is actually a series of floats, not ints. So I want to cast/convert the builder with the already collected values to another type and continue parsing.
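The parsing flow described here can be sketched without arrow at all. In this illustration a plain `Vec<Option<_>>` stands in for the PrimitiveBuilder, the `Column` enum and `parse_column` function are hypothetical names, and the literal token "null" is an assumed null marker; the real code would use builders and whatever null marker the file format defines.

```rust
/// Parsed column; a plain Vec<Option<_>> stands in for a PrimitiveBuilder.
#[derive(Debug, PartialEq)]
enum Column {
    Ints(Vec<Option<i32>>),
    Floats(Vec<Option<f32>>),
}

/// Parse tokens as i32 until one fails, then widen everything collected so
/// far to f32 and continue. "null" is an assumed null marker.
fn parse_column(tokens: &[&str]) -> Result<Column, String> {
    let mut col = Column::Ints(Vec::new());
    for &tok in tokens {
        if tok == "null" {
            match &mut col {
                Column::Ints(v) => v.push(None),
                Column::Floats(v) => v.push(None),
            }
            continue;
        }
        col = match col {
            Column::Ints(mut v) => match tok.parse::<i32>() {
                Ok(n) => {
                    v.push(Some(n));
                    Column::Ints(v)
                }
                Err(_) => {
                    let f: f32 = tok
                        .parse()
                        .map_err(|_| format!("not a number: {}", tok))?;
                    // Widen the values collected so far, keeping nulls as nulls.
                    let mut w: Vec<Option<f32>> =
                        v.into_iter().map(|x| x.map(|n| n as f32)).collect();
                    w.push(Some(f));
                    Column::Floats(w)
                }
            },
            Column::Floats(mut v) => {
                let f: f32 = tok
                    .parse()
                    .map_err(|_| format!("not a number: {}", tok))?;
                v.push(Some(f));
                Column::Floats(v)
            }
        };
    }
    Ok(col)
}

fn main() {
    let col = parse_column(&["1", "null", "2.5"]);
    println!("{:?}", col);
}
```

The widening step in the `Err` branch is the part that, with arrow builders, corresponds to converting the already filled builder to a builder of another type.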
One approach I tried:
    use arrow_array::{builder::PrimitiveBuilder, types::ArrowPrimitiveType};
    use num::cast::AsPrimitive;

    // Converts a builder of one primitive type into a builder of another.
    trait Caster<U> {
        fn cast(&mut self) -> PrimitiveBuilder<U>
        where
            U: ArrowPrimitiveType;
    }

    impl<T, U> Caster<U> for PrimitiveBuilder<T>
    where
        T: ArrowPrimitiveType,
        U: ArrowPrimitiveType,
        T::Native: AsPrimitive<U::Native>,
    {
        fn cast(&mut self) -> PrimitiveBuilder<U> {
            let src_array = self.finish();
            // Apply the numeric conversion to every value.
            let dst_array = src_array.unary::<_, U>(AsPrimitive::<U::Native>::as_);
            dst_array.into_builder().expect("Converting array to builder")
        }
    }

Here, if the source contained nulls, into_builder returns an Err(...), which I assume is related to how the unary function clones the NullBuffer. It would be nice if into_builder handled this better by creating a new null buffer when it cannot reuse the existing one.
The next approach I tried:
    use arrow_array::{builder::PrimitiveBuilder, types::ArrowPrimitiveType};
    use num::cast::NumCast;

    // Same idea, but using a fallible numeric cast per value.
    trait Caster<U> {
        fn cast(&mut self) -> PrimitiveBuilder<U>
        where
            U: ArrowPrimitiveType,
            U::Native: NumCast;
    }

    impl<T, U> Caster<U> for PrimitiveBuilder<T>
    where
        T: ArrowPrimitiveType,
        T::Native: NumCast,
        U: ArrowPrimitiveType,
        U::Native: NumCast,
    {
        fn cast(&mut self) -> PrimitiveBuilder<U> {
            let src_array = self.finish();
            // Values for which the cast returns None become nulls.
            let dst_array = src_array.unary_opt::<_, U>(num::cast::cast::<T::Native, U::Native>);
            dst_array.into_builder().expect("Converting array to builder")
        }
    }
This panics due to the len < capacity assertion, if the source contains nulls and has a length divisible by 8.