-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPR: Remove unnecessary kwargs in groupby ops #50483
Conversation
pandas/core/groupby/groupby.py
Outdated
""" | ||
Cumulative sum for each group. | ||
|
||
Returns | ||
------- | ||
Series or DataFrame | ||
""" | ||
nv.validate_groupby_func("cumsum", args, kwargs, ["numeric_only", "skipna"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI is failing because of this. You can't simply delete the values, you should provide an empty tuple and an empty dictionary instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am getting this error Error: The runner has received a shutdown signal...' in GHA. GH 45651
, could you help on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add an entry in the whatsnew notes for this change.
Will do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! We should only be deprecating at this point. The arguments can only be removed after a deprecation period. The removal won't happen until we begin working on pandas 3.0.
Thanks I will do the changes |
I looked over the net but could not find out how to deprecate this, can you please explain. I thought that we gad to remove the arguments when I opened the PR. |
The idea is that in the next version we start showing warnings to the users in the next version, to let them know if they are passing the arguments that they will be removed, and some versions later we will actually remove them, once they had the time to update their code. I'd personally still fix this PR with the removals first. The parameters are not veing used, but I wonder if there was any reason we are misssing to add them in the first place. So, it'll be good to see that nothing in our test suite breaks when removing them, before we apply the deprecation warnings. For the warnings, check at closed PRs with the Deprecate label, and find one about arguments. We have a decorator to do it, but I'm in my phone and I fon't remember the exact name, something like |
thanks i'll look for closed pr with |
@datapythonista - these arguments are to support passing groupby objects to NumPy; #47836 and #48277 |
so should we add this |
Thanks @rhshadrach for the info, I wasn't aware of that, good to know. @ramvikrams |
Yeah I'll push a commit in 30 mins |
I think it's done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the changes look good except I'm not seeing cummin / cummax and this needs tests.
For tests, you want to invoke the function and ensure the appropriate warning is raised. Use with tm.assert_produces_warning(FutureWarning, match="..."):
with the appropriate warning message filled in.
doc/source/whatsnew/v2.0.0.rst
Outdated
@@ -735,6 +735,7 @@ Removal of prior version deprecations/changes | |||
- Removed the deprecated argument ``line_terminator`` from :meth:`DataFrame.to_csv` (:issue:`45302`) | |||
- Removed the deprecated argument ``label`` from :func:`lreshape` (:issue:`30219`) | |||
- Arguments after ``expr`` in :meth:`DataFrame.eval` and :meth:`DataFrame.query` are keyword-only (:issue:`47587`) | |||
- Deprecate argument ``**kwargs`` from :meth:`cummax`, :meth:`cummin`, :meth:`cumsum`, :meth:`cumprod`, :meth:`take` and :meth:`skew` (:issue:`50483`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently in the section of deprecations that we are enforcing for 2.0. Instead, it should be in the new deprecations that are introduced in 2.0. That's one section up from this.
Can you add "unused" before "argument" to indicate to users we're not removing any functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes surely i'll do the changes
@@ -155,3 +165,17 @@ def test_class(): | |||
) | |||
with tm.assert_produces_warning(FutureWarning, match=msg): | |||
Foo().baz("qux", "quox") | |||
|
|||
|
|||
def test_deprecated_keywords(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is for testing the decorator itself, these tests should go in pandas/tests/groupby/test_function
.
"and kwargs will be deprecated for these functions" | ||
) | ||
with tm.assert_produces_warning(FutureWarning, match=msg): | ||
assert cummax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert [statement]
evaluates the truthiness of the statement. For example, []
will evaluate to False (and hence raise if used in an assert) whereas [1]
will evaluate to True. Functions are always truthy - they evaluate to True.
Instead, this needs to call the function itself, e.g.
df = DataFrame({'a': [1, 1, 2], 'b': [3, 4, 5]})
gb = df.groupby('a')
gb.cummax(kwargs=1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh thanks I'll do the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i did the changes but still the tests are failing, don't know why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments
assert gb.cummax(kwargs=1) | ||
assert gb.cummin(kwargs=1) | ||
assert gb.cumsum(args=1, kwargs=1) | ||
assert gb.cumprod(args=1, kwargs=1) | ||
assert gb.skew(kwargs=1) | ||
assert gb.take(kwargs=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert
is used to assert that a certain condition is true. Do you want to assert something here, or just call these functions?
Also, what will happen with this test if cummax
produces the warning, but the other functions don't? Seems like the test will pass, even if the changes don't do what we want them to do. You probably want to use pytest parametrization, with the function name as a parameter, then you can retrieve the function with getattr(gb, func_name)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to use pytest parametrization, with the function name as a parameter, then you can retrieve the function with
getattr(gb, func_name)
.
i'll try this way
@@ -893,6 +894,9 @@ def fillna( | |||
return result | |||
|
|||
@doc(Series.take.__doc__) | |||
@deprecate_nonkeyword_arguments( | |||
version="3.0", allowed_args=["self", "axis", "indices"] | |||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having another look, looks like deprecate_nonkeyword_arguments
doesn't deprecate the argument, but makes it usable only via a keyword argument (e.g. foo(1)
would produce the warning, but foo(val=1)
wouldn't). If that's the case, doesn't seem like we've got a decorator to simply deprecate an argument. In that case, you'll have to produce it manually.
with tm.assert_produces_warning(FutureWarning, match=msg): | ||
df = DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]}) | ||
gb = df.groupby("a") | ||
assert gb.cummax(kwargs=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using kwargs
as the sample keyword argument is a bit misleading. Something like gb.cummax(incorrect_argument=
)` would probably be clearer.
gb = df.groupby("a") | ||
assert gb.cummax(kwargs=1) | ||
assert gb.cummin(kwargs=1) | ||
assert gb.cumsum(args=1, kwargs=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you've got this wrong. This is not testing args
at all, to test if the *args
deprecation is working you need to use gb.cumsum(1, incorrect_argument=1)
. And I'd recommend splitting this into two separate tests, otherwise you will only test that one of them works, not that both work.
I'd recommend that you make sure you understand well what the *
and **
do in a function signature like def foo(my_arg, *args, **kwargs):
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
@ramvikrams do you want to keep working on this PR? |
i think it's better I close it |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.