BUG: regression in error raised by idxmin/idxmax for extension dtypes #32749

jorisvandenbossche · 2020-03-16T13:21:04Z

With pandas 1.0.1:

In [7]: from pandas.tests.extension.decimal import DecimalArray, make_data 

In [8]: s = pd.Series(DecimalArray(make_data()[:5]))  

In [9]: s 
Out[9]: 
0    Decimal: 0.25866656130631138221787068687262944...
1    Decimal: 0.11511905452780368808163302674074657...
2    Decimal: 0.15301679241167220890673661415348760...
3    Decimal: 0.41125672759464626526693109553889371...
4    Decimal: 0.46391316685725048074573351186700165...
dtype: decimal

In [10]: s.idxmin()   
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-5060aab05d12> in <module>
----> 1 s.idxmin()

~/miniconda3/envs/pandas10/lib/python3.8/site-packages/pandas/core/series.py in idxmin(self, axis, skipna, *args, **kwargs)
   2037         """
   2038         skipna = nv.validate_argmin_with_skipna(skipna, args, kwargs)
-> 2039         i = nanops.nanargmin(com.values_from_object(self), skipna=skipna)
   2040         if i == -1:
   2041             return np.nan

~/miniconda3/envs/pandas10/lib/python3.8/site-packages/pandas/core/nanops.py in _f(*args, **kwargs)
     62             if any(self.check(obj) for obj in obj_iter):
     63                 f_name = f.__name__.replace("nan", "")
---> 64                 raise TypeError(
     65                     f"reduction operation '{f_name}' not allowed for this dtype"
     66                 )

TypeError: reduction operation 'argmin' not allowed for this dtype

Now with master you get:

In [4]: s.idxmin() 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-7e62303a9d43> in <module>
----> 1 s.idxmin()

~/scipy/pandas/pandas/core/series.py in idxmin(self, axis, skipna, *args, **kwargs)
   1988         """
   1989         skipna = nv.validate_argmin_with_skipna(skipna, args, kwargs)
-> 1990         i = nanops.nanargmin(self._values, skipna=skipna)
   1991         if i == -1:
   1992             return np.nan

~/scipy/pandas/pandas/core/nanops.py in _f(*args, **kwargs)
     69             try:
     70                 with np.errstate(invalid="ignore"):
---> 71                     return f(*args, **kwargs)
     72             except ValueError as e:
     73                 # we want to transform an object array

~/scipy/pandas/pandas/core/nanops.py in nanargmin(values, axis, skipna, mask)
    943     """
    944     values, mask, dtype, _, _ = _get_values(
--> 945         values, True, fill_value_typ="+inf", mask=mask
    946     )
    947     result = values.argmin(axis)

~/scipy/pandas/pandas/core/nanops.py in _get_values(values, skipna, fill_value, fill_value_typ, mask)
    311         # promote if needed
    312         else:
--> 313             values, _ = maybe_upcast_putmask(values, mask, fill_value)
    314 
    315     # return a platform independent precision dtype

~/scipy/pandas/pandas/core/dtypes/cast.py in maybe_upcast_putmask(result, mask, other)
    278     """
    279     if not isinstance(result, np.ndarray):
--> 280         raise ValueError("The result input must be a ndarray.")
    281     if not is_scalar(other):
    282         # We _could_ support non-scalar other, but until we have a compelling

ValueError: The result input must be a ndarray.

@jbrockmendel this comes from no longer using values_from_object and instead passing _values to the nanops function, which expects to always receive an ndarray.


jbrockmendel · 2020-03-16T15:05:55Z

sounds like a 2-liner to check for non-ndarray and raise the old exception?
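A minimal sketch of what such a check could look like, assuming a standalone helper. The name `check_ndarray_for_reduction` and the `NotAnNdarray` stand-in are hypothetical and not part of pandas; only the error message is taken from the pandas 1.0.1 traceback above.

```python
import numpy as np


def check_ndarray_for_reduction(values, op_name):
    """Hypothetical guard: reject non-ndarray input with the pandas 1.0.1 message."""
    if not isinstance(values, np.ndarray):
        raise TypeError(
            f"reduction operation '{op_name}' not allowed for this dtype"
        )
    return values


class NotAnNdarray:
    """Stand-in for an extension array; deliberately not an ndarray."""


# A plain ndarray passes through unchanged.
check_ndarray_for_reduction(np.array([3.0, 1.0, 2.0]), "argmin")

# Anything else raises the old TypeError instead of the new ValueError.
try:
    check_ndarray_for_reduction(NotAnNdarray(), "argmin")
except TypeError as err:
    message = str(err)
```

This only restores the old error; it does not make the reduction work for extension arrays that could be reduced.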

jorisvandenbossche · 2020-03-16T15:10:51Z

Maybe not necessarily, you might have EAs that can be converted to numerical ndarrays?

jbrockmendel · 2020-03-16T15:55:52Z

> you might have EAs that can be converted to numerical ndarrays?

Should values_for_argsort work for that?

jorisvandenbossche · 2020-03-17T07:54:00Z

I suppose so? If argsort works on the values, argmin/argmax should also do the expected thing?
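A rough sketch of that idea, assuming a toy array class rather than a real pandas ExtensionArray. `ToyDecimalArray` and `argmin_via_argsort_values` are hypothetical names; the only thing borrowed from the extension interface is the `_values_for_argsort` method name. Missing values are not handled here.

```python
from decimal import Decimal

import numpy as np


class ToyDecimalArray:
    """Hypothetical stand-in for an extension array holding Decimal values."""

    def __init__(self, data):
        self._data = list(data)

    def _values_for_argsort(self):
        # Return an ndarray whose ordering matches the ordering of the data.
        # Converting to float is a simplification made for this sketch.
        return np.array([float(d) for d in self._data], dtype="float64")


def argmin_via_argsort_values(arr):
    """Position of the smallest element, computed on the argsort values."""
    return int(np.argmin(arr._values_for_argsort()))


arr = ToyDecimalArray([Decimal("0.25"), Decimal("0.11"), Decimal("0.41")])
position = argmin_via_argsort_values(arr)
```

Here `position` agrees with the first entry of `np.argsort(arr._values_for_argsort())`, which is the consistency the comment above relies on.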

TomAugspurger · 2020-07-07T21:26:00Z

I'd like to fix this by having Series.idxmin() eventually call .array._reduce("argmin"). However, there's a potential issue with how we've defined argmin on our arrays for missing values.

In [52]: pd.array([1, 2, None], dtype="float64")._reduce("argmin", skipna=False)  # PandasArray
Out[52]: <NA>

In [53]: pd.array([1, 2, None], dtype="Int64")._reduce("argmin", skipna=False)
Out[53]: -1

IMO, _reduce should return the correct result, which in this case is, I think, NA (under the rule that skipna=False propagates NA).
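The propagation rule described here could be sketched as follows, assuming a values-plus-mask representation. `masked_argmin` is a hypothetical helper, not a pandas function, and `None` stands in for NA.

```python
import numpy as np


def masked_argmin(values, mask, skipna=True):
    """Hypothetical helper: argmin where `mask` marks missing entries.

    Returns None (standing in for NA) when the result is missing.
    """
    values = np.asarray(values, dtype="float64")
    mask = np.asarray(mask, dtype=bool)
    if mask.any() and not skipna:
        # skipna=False: any missing value makes the result missing.
        return None
    valid = np.flatnonzero(~mask)
    if valid.size == 0:
        # Nothing left to compare (all missing, or empty input).
        return None
    # Take argmin over the valid entries, then map back to original positions.
    return int(valid[values[valid].argmin()])


with_skip = masked_argmin([1, 2, 0], [False, False, True], skipna=True)
without_skip = masked_argmin([1, 2, 0], [False, False, True], skipna=False)
```

With `skipna=True` the masked third entry is ignored, so the result is position 0; with `skipna=False` the result is missing rather than a sentinel such as -1.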

jorisvandenbossche · 2020-07-08T12:20:06Z

> However, there's a potential issue with how we've defined argmin on our arrays for missing values.

@TomAugspurger see also #33941, #33942

simonjayhawkins · 2020-08-18T10:28:05Z

moved off 1.1.1 milestone (scheduled for this week) as no PRs to fix in the pipeline

simonjayhawkins · 2020-09-07T09:25:15Z

moved off 1.1.2 milestone (scheduled for this week) as no PRs to fix in the pipeline

simonjayhawkins · 2020-09-07T14:55:34Z

first bad commit: [8c38283] CLN: avoid values_from_object in Series (#32426)

simonjayhawkins · 2020-10-05T12:37:47Z

moved off 1.1.3 milestone (overdue) as no PRs to fix in the pipeline

simonjayhawkins · 2020-10-29T14:04:23Z

moved off 1.1.4 milestone (scheduled for release tomorrow) as no PRs to fix in the pipeline

jorisvandenbossche added this to the 1.1 milestone Mar 16, 2020

jorisvandenbossche mentioned this issue Mar 16, 2020

CLN: avoid values_from_object in Series #32426

Merged

cklb mentioned this issue Apr 22, 2020

BUG: idxmin() fails for nullable integer data type (Int64) #33719

Closed

3 tasks

simonjayhawkins added the Error Reporting, Regression, and ExtensionArray labels Apr 25, 2020

mroeschke added the Bug label May 11, 2020

jorisvandenbossche mentioned this issue Jun 13, 2020

Release 0.8.0 geopandas/geopandas#1432

Closed

25 tasks

jreback modified the milestones: 1.1, 1.1.1 Jul 10, 2020

martinfleis mentioned this issue Jul 28, 2020

BUG: test_numerical_operations fails under pandas 1.1.0 geopandas/geopandas#1541

Closed

brendan-ward mentioned this issue Jul 29, 2020

TST: Fix CI for pandas 1.1.0 geopandas/geopandas#1544

Merged

simonjayhawkins modified the milestones: 1.1.1, 1.1.2 Aug 18, 2020

simonjayhawkins modified the milestones: 1.1.2, 1.1.3 Sep 7, 2020

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Sep 7, 2020

add code sample for pandas-dev#32749

2521f05

nrebena mentioned this issue Sep 21, 2020

Regr/period range large value/issue 36430 #36535

Merged

5 tasks

simonjayhawkins modified the milestones: 1.1.3, 1.1.4 Oct 5, 2020

simonjayhawkins modified the milestones: 1.1.4, 1.1.5 Oct 29, 2020

jreback modified the milestones: 1.1.5, Contributions Welcome Nov 25, 2020

tonyyyyip mentioned this issue Dec 5, 2020

BUG: idxmax/min (and argmax/min) for Series with underlying ExtensionArray #37924

Merged

5 tasks

jorisvandenbossche modified the milestones: Contributions Welcome, 1.3 Dec 29, 2020

jreback closed this as completed in #37924 Jan 1, 2021
