Missing Dataset methods #163

Open
62 of 76 tasks
OlivierBlanvillain opened this issue Aug 8, 2017 · 3 comments

Comments

@OlivierBlanvillain
Contributor

OlivierBlanvillain commented Aug 8, 2017

Here is an exhaustive status of the API implemented by frameless.TypedDataset compared to Spark's Dataset. We are getting pretty close to 100% API coverage 😄
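
For readers new to the library, here is a minimal sketch of what the typed side of this comparison looks like (assuming frameless 0.x-era syntax and a made-up Apartment case class; the SparkSession setup is illustrative only):

```scala
import frameless.TypedDataset
import org.apache.spark.sql.SparkSession

// Hypothetical record type, used only for illustration.
case class Apartment(city: String, surface: Int, price: Double)

object TypedDatasetSketch {
  // TypedDataset.create needs an implicit SparkSession; a TypedEncoder is derived for the case class.
  implicit val spark: SparkSession =
    SparkSession.builder().master("local[*]").appName("frameless-sketch").getOrCreate()

  val apartments = Seq(Apartment("Paris", 50, 300000.0), Apartment("Nice", 74, 325000.0))

  // Typed counterpart of spark.createDataset(apartments).
  val aptDs: TypedDataset[Apartment] = TypedDataset.create(apartments)

  // Column references are checked at compile time:
  // aptDs('cty) would not compile, whereas Dataset.col("cty") only fails at runtime.
  val cities: TypedDataset[String] = aptDs.select(aptDs('city))
}
```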

Won't fix:

  • Dataset alias(String alias) inherently unsafe
  • Dataset withColumnRenamed(String existingName, String newName) inherently unsafe
  • void createGlobalTempView(String viewName) inherently unsafe
  • void createOrReplaceTempView(String viewName) inherently unsafe
  • void createTempView(String viewName) inherently unsafe
  • void registerTempTable(String tableName) inherently unsafe
  • Dataset where(String conditionExpr) use select instead
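
To make the "inherently unsafe" label above concrete, here is a minimal plain-Spark sketch (column names made up): a misspelled column in a string expression only fails at runtime, and withColumnRenamed on a missing column is a silent no-op.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object StringlyTypedSketch {
  val spark: SparkSession =
    SparkSession.builder().master("local[*]").appName("unsafe-sketch").getOrCreate()
  import spark.implicits._

  val df = Seq(("Paris", 300000.0), ("Nice", 325000.0)).toDF("city", "price")

  // Compiles fine, but the misspelled column is only caught when the plan is analysed at runtime.
  try df.where("pric > 310000").show()
  catch { case e: AnalysisException => println(s"runtime failure: ${e.getMessage}") }

  // Renaming a column that does not exist never fails: it is documented as a no-op.
  df.withColumnRenamed("pric", "cost").printSchema()
}
```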

TODO:

Done:

  • Dataset sort(String sortCol, String... sortCols) (Window dense rank #248)
  • Dataset sortWithinPartitions(String sortCol, String... sortCols) (Window dense rank #248)
  • Dataset repartition(int numPartitions, Column... partitionExprs)
  • Dataset drop(String... colNames) (I#163 dataset drop #209)
  • Dataset join(Dataset<?> right, Column joinExprs, String joinType)
  • Dataset<scala.Tuple2<T,U>> joinWith(Dataset other, Column condition, String joinType)
  • Dataset crossJoin(Dataset<?> right)
  • Dataset agg(Column expr, Column... exprs)
  • Column apply(String colName)
  • Dataset as(Encoder evidence2)
  • Dataset cache()
  • Dataset coalesce(int numPartitions)
  • Column col(String colName)
  • Object collect()
  • long count()
  • Dataset distinct()
  • Dataset except(Dataset other)
  • void explain(boolean extended)
  • <A,B> Dataset explode(String inputColumn, String outputColumn, scala.Function1<A,TraversableOnce<B>> f)
  • Dataset filter(Column condition)
  • Dataset filter(scala.Function1<T,Object> func)
  • T first() (as firstOption)
  • Dataset flatMap(scala.Function1<T,TraversableOnce> func, Encoder evidence8)
  • void foreach(ForeachFunction func)
  • void foreachPartition(scala.Function1<Iterator,scala.runtime.BoxedUnit> f)
  • RelationalGroupedDataset groupBy(String col1, String... cols)
  • Dataset intersect(Dataset other)
  • Dataset limit(int n)
  • Dataset map(scala.Function1<T,U> func, Encoder evidence6)
  • Dataset mapPartitions(MapPartitionsFunction<T,U> f, Encoder encoder)
  • Dataset persist(StorageLevel newLevel)
  • void printSchema()
  • RDD rdd()
  • T reduce(scala.Function2<T,T,T> func) (as reduceOption)
  • Dataset repartition(int numPartitions)
  • Dataset sample(boolean withReplacement, double fraction, long seed)
  • Dataset select(String col, String... cols)
  • void show(int numRows, boolean truncate)
  • Object take(int n)
  • Dataset toDF()
  • String toString()
  • Dataset transform(scala.Function1<Dataset,Dataset> t)
  • Dataset union(Dataset other)
  • Dataset unpersist(boolean blocking)
  • Dataset withColumn(String colName, Column col)
  • Dataset orderBy(String sortCol, String... sortCols)
  • String[] columns()
  • org.apache.spark.sql.execution.QueryExecution queryExecution()
  • StructType schema()
  • SparkSession sparkSession()
  • SQLContext sqlContext()
  • Dataset checkpoint(boolean eager)
  • String[] inputFiles()
  • boolean isLocal()
  • boolean isStreaming()
  • Dataset[] randomSplit(double[] weights, long seed)
  • StorageLevel storageLevel()
  • Dataset toJSON()
  • java.util.Iterator toLocalIterator()
  • DataFrameWriter write()
@snadorp
Contributor

snadorp commented Jul 25, 2018

<A,B> Dataset explode(String inputColumn, String outputColumn, scala.Function1<A,TraversableOnce<B>> f) does not work for Map-typed columns, while vanilla Spark supports them.
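
For reference, this is the vanilla Spark behaviour being compared against (a minimal sketch using functions.explode; data made up): exploding a MapType column yields one row per entry, with two output columns, key and value.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.explode

object MapExplodeSketch {
  val spark: SparkSession =
    SparkSession.builder().master("local[*]").appName("map-explode").getOrCreate()
  import spark.implicits._

  // Each row holds a Map[String, Int].
  val df = Seq(Map("a" -> 1, "b" -> 2), Map("c" -> 3)).toDF("m")

  // Vanilla Spark explodes a MapType column into one row per entry,
  // producing two columns (key, value) rather than a single element column.
  df.select(explode($"m")).show()
}
```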

@imarios
Contributor

imarios commented Jul 25, 2018

Yes, I was not able to fit Map because its type signature has two type holes, compared to one for all the others. I think we could add an overloaded method just for Map.
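
A rough sketch of what such an overload's signature could look like (purely hypothetical, names and shape made up for discussion; not frameless's actual API):

```scala
import frameless.{TypedDataset, TypedEncoder}

// Hypothetical signature only: a Map-specific explode needs two type holes,
// K for keys and V for values, yielding one (K, V) row per map entry,
// instead of the single element type used by the collection-based explode.
trait MapExplodeOverload {
  def explodeMap[A, K: TypedEncoder, V: TypedEncoder](
      ds: TypedDataset[A])(f: A => Map[K, V]): TypedDataset[(K, V)]
}
```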

@etspaceman
Contributor

writeStream can be marked as done here.
