-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add drop to a case class #212
Conversation
Codecov Report
@@ Coverage Diff @@
## master #212 +/- ##
==========================================
+ Coverage 95.86% 95.99% +0.12%
==========================================
Files 42 42
Lines 847 848 +1
Branches 11 11
==========================================
+ Hits 812 814 +2
+ Misses 35 34 -1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look really good! Let's discuss a bit the comments and we can merge.
* | ||
* @see [[frameless.TypedDataset.dropApply#apply]] | ||
*/ | ||
def drop[U] = new dropApply[U] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks kind of strange that the types are documented here, but they are not defined here ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that it looks strange but it's the same thing that I did
frameless/dataset/src/main/scala/frameless/TypedDataset.scala
Lines 669 to 702 in 4dbc2b3
/** Adds a column to a Dataset so long as the specified output type, `U`, has | |
* an extra column from `T` that has type `A`. | |
* | |
* @example | |
* {{{ | |
* case class X(i: Int, j: Int) | |
* case class Y(i: Int, j: Int, k: Boolean) | |
* val f: TypedDataset[X] = TypedDataset.create(X(1,1) :: X(1,1) :: X(1,10) :: Nil) | |
* val fNew: TypedDataset[Y] = f.withColumn[Y](f('j) === 10) | |
* }}} | |
* @param ca The typed column to add | |
* @param i0 TypeEncoder for output type U | |
* @param i1 TypeEncoder for added column type A | |
* @param i2 the LabelledGeneric derived for T | |
* @param i3 the LabelledGeneric derived for U | |
* @param i4 proof no fields have been removed | |
* @param i5 diff from T to U | |
* @param i6 keys from newFields | |
* @param i7 the one and only new key | |
* @param i8 the one and only new field enforcing the type of A exists | |
* @param i9 the keys of U | |
* @param iA allows for traversing the keys of U | |
* @tparam U the output type | |
* @tparam A The added column type | |
* @tparam TRep shapeless' record representation of T | |
* @tparam URep shapeless' record representation of U | |
* @tparam UKeys the keys of U as an HList | |
* @tparam NewFields the added fields to T to get U | |
* @tparam NewKeys the keys of NewFields as an HList | |
* @tparam NewKey the first, and only, key in NewKey | |
* | |
* @see [[frameless.TypedDataset.withColumnApply#apply]] | |
*/ | |
def withColumn[U] = new withColumnApply[U] |
The reason for doing so is to get the nice scaladoc in IDEs etc on the method call site that the user will see. I don't have strong feelings either way but originally moved it due to the suggestion #208 (comment).
Whatever the decision, we should be consistent and update both this and withColumn
.
check(prop[SQLDate] _) | ||
check(prop[Option[X1[Boolean]]] _) | ||
test("remove column in the middle") { | ||
val f: TypedDataset[X] = TypedDataset.create(X(1,1, false) :: X(1,1, false) :: X(1,10, false) :: Nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: spacing seems incostend. (same for other X literals)
*/ | ||
def drop[U] = new dropApply[U] | ||
|
||
class dropApply[U] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dropApply can have a regular class name DropApply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I'll update
class withColumnApply[U] { |
@imarios Thanks once again for the review! Let me know how to proceed regarding #212 (comment) |
@frosforever I had this thought after finishing the report: Isn't |
@imarios Ha that's an excellent point!
What do you think? |
I think keeping drop for consistency with spark makes sense. How about we keep |
1c05392
to
0fab834
Compare
@imarios agreed. Updated PR. |
Connects to #163 and #209
Similar to #208 adds support for dropping a column from a dataset with greater than 22 columns by returning a specified case class instead of a tuple.