-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: GroupBy.idxmax/idxmin with EA dtypes #38733
Conversation
To be clear, the open issues are mainly about the behaviour for argmin/max with skipna=False, not for idxmin/max (but since idxmin/max is typically implemented based on argmin/max, it's still impacted) |
@jorisvandenbossche do the blockers you mentioned have some momentum to them? i.e. if I want to get this fixed, do i need to go work on those directly or can i just be patient for a bit? |
gentle ping @jorisvandenbossche thoughts on how to make this actionable? |
By looking into those linked issues, and trying to move them forward .. Now, I think this PR can already fix the non-skipna case in the meantime? |
pandas/core/arrays/base.py
Outdated
@@ -596,7 +596,7 @@ def argsort( | |||
mask=np.asarray(self.isna()), | |||
) | |||
|
|||
def argmin(self): | |||
def argmin(self, axis=None, skipna: bool = True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The axis
keyword shouldn't be added, since it's being ignored. It can be accepted through args/kwargs, and then validated with our numpy compat (+ tests for the numpy compat)
pandas/core/arrays/base.py
Outdated
@@ -611,9 +611,12 @@ def argmin(self): | |||
-------- | |||
ExtensionArray.argmax | |||
""" | |||
validate_bool_kwarg(skipna, "skipna") | |||
if not skipna and self.isna().any(): | |||
return -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those should not return -1, IMO. I would prefer to not put this "wrong" behaviour in the EAs (but keep it in Series were it already has this behaviour), as long as we didn't decide on the preferred behaviour.
Short term, could eg raise an error saying that skipna=False is not yet implemented, but by accepting skipna
, it can already fix the default groupby case (see my comment at #38733 (comment))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could eg raise an error saying that skipna=False is not yet implemented
so... like i had in the previous commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so... like i had in the previous commit?
Yes, that would be my preference.
(sorry for not being explicit about it, but my earlier comment of "this PR can already fix the non-skipna case in the meantime?" quite well matched the state of the PR, it was mainly to mean: we don't have to solve the skipna=False discussion to merge a PR like this to fix the skipna=True case)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a whatsnew note?
updated |
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
updated + green |
return nargminmax(self, "argmin") | ||
|
||
def argmax(self): | ||
def argmax(self, skipna: bool = True) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you update the Parameters section here (and for argmin)
@@ -628,6 +631,9 @@ def argmax(self): | |||
-------- | |||
ExtensionArray.argmin | |||
""" | |||
validate_bool_kwarg(skipna, "skipna") | |||
if not skipna and self.isna().any(): | |||
raise NotImplementedError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test that hits this (and argmin)
updated docstrings, not sure about testing for the NotImplementedError cases since that isnt actually desired behavior |
sure but we want to lock it down to make sure that it is not actually changed w/o doing anything about it. |
test added + greenish |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
@jorisvandenbossche thoughts on how/where to handle the skipna kwarg?