v2.5.0

@mjakubowski84 mjakubowski84 released this 24 Apr 17:17
· 125 commits to master since this release

Release 2.5.0 continues the evolution of ParquetIterable. Building on the previously introduced compound iterables, the core module now supports reading partitioned data. Unlike in the Akka and FS2 modules, reading partitions must be enabled explicitly. This approach was chosen because looking for partitions adds I/O overhead, which is unwelcome in a low-level library. You can enable the new feature by calling the partitioned switch in the builder:

ParquetReader.as[YourSchema].partitioned.read(yourPath)
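
As a slightly fuller sketch, partitioned reading might look like the following. The `User` case class, the `/tmp/users` path, and the Hive-style directory layout (for example `/tmp/users/country=PL/...`) are illustrative assumptions, as is the use of `String` for the partition field; none of them are part of the release itself.

```scala
import com.github.mjakubowski84.parquet4s.{ParquetReader, Path}

// Hypothetical schema. `country` is assumed to be taken from the directory
// name (e.g. country=PL) rather than from a column stored in the files.
case class User(id: Long, name: String, country: String)

object ReadPartitioned extends App {
  // Hypothetical root directory of the partitioned dataset.
  val users = ParquetReader.as[User].partitioned.read(Path("/tmp/users"))

  // ParquetIterable keeps files open while iterating, so close it when done.
  try users.foreach(println)
  finally users.close()
}
```

Without the `partitioned` switch, the reader skips partition discovery and avoids the extra I/O.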

Moreover, the number of experimental ETL features is growing. A convenient way of writing datasets has been added: you can now call writeAndClose directly on a ParquetIterable to write the dataset and release all open resources. This keeps the ETL DSL clean and more readable.
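
A minimal sketch of such a pipeline, assuming that `writeAndClose` takes the destination path as its argument and that a hypothetical `User` case class matches the data, could read a dataset and write it to a new location in one chain. Both paths are placeholders.

```scala
import com.github.mjakubowski84.parquet4s.{ParquetReader, Path}

// Hypothetical schema used only for illustration.
case class User(id: Long, name: String)

object CopyUsers extends App {
  ParquetReader
    .as[User]
    .read(Path("/tmp/users"))               // hypothetical input location
    .writeAndClose(Path("/tmp/users-copy")) // writes the data, then closes the iterable
}
```

Because the call closes the underlying iterable itself, no separate `close()` or try/finally block is needed around the chain.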