Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure Index._data is an ndarray #23628

Merged

Conversation

TomAugspurger
Copy link
Contributor

Split from #23623, where it was
causing issues.

BUG: Ensure that Index._data is an ndarray

Split from pandas-dev#23623, where it was
causing issues with infer_dtype.
@pep8speaks
Copy link

Hello @TomAugspurger! Thanks for submitting the PR.

@jreback jreback added Reshaping Concat, Merge/Join, Stack/Unstack, Explode Dtype Conversions Unexpected or buggy dtype conversions labels Nov 11, 2018
@jreback jreback added this to the 0.24.0 milestone Nov 11, 2018
@jreback
Copy link
Contributor

jreback commented Nov 11, 2018

lgtm. circle seems to be failing lots of PRs today.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rather fix the places where _simple_new is called with an Index or Series ? Or are that many places (I suppose the bool code path) ?

@TomAugspurger
Copy link
Contributor Author

I started fixing the callers, but gave up when I saw all the potential call sites. It's not just boolean index, the numeric ones hit this as well.

@TomAugspurger
Copy link
Contributor Author

I didn't look into why the expected failures in
fa8934a changed.

@codecov
Copy link

codecov bot commented Nov 12, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@6d031f2). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #23628   +/-   ##
=========================================
  Coverage          ?   92.24%           
=========================================
  Files             ?      161           
  Lines             ?    51341           
  Branches          ?        0           
=========================================
  Hits              ?    47362           
  Misses            ?     3979           
  Partials          ?        0
Flag Coverage Δ
#multiple 90.64% <100%> (?)
#single 42.34% <100%> (?)
Impacted Files Coverage Δ
pandas/core/indexes/base.py 96.46% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6d031f2...265ffcb. Read the comment docs.

@TomAugspurger
Copy link
Contributor Author

All green.

[ci skip]

***NO CI***
@jbrockmendel
Copy link
Member

Should we rather fix the places where _simple_new is called with an Index or Series ?

+1 for the "sooner side of eventually" pile

@TomAugspurger
Copy link
Contributor Author

TomAugspurger commented Nov 12, 2018

@jbrockmendel can you comment on the skips / expected values in
https://github.com/pandas-dev/pandas/pull/23628/files/fa8934a3082300c209b4dddc9eb7e34029b6493f#diff-078718e1fe56beae8c2e3be2405f4d19?

It seems to me like the expected values are incorrect. Consider op=ops.rand_. In that case, we're saying that

ops.rand_(series, index) -> Series

but it seems to me that ops.and_(index, series) -> Index, right?


edit: just saw your reply.

@TomAugspurger
Copy link
Contributor Author

Based on my thinking in
#23628 (comment), I've moved the reverse ops to a different test in e4b21f6. Let me know if that seems completely wrong-headed.

@TomAugspurger
Copy link
Contributor Author

TomAugspurger commented Nov 13, 2018

Any objections to merging? (this is blocking IndexOpsMinx.array / to_numpy())

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment

@@ -522,6 +522,12 @@ def _simple_new(cls, values, name=None, dtype=None, **kwargs):
values = cls(values, name=name, dtype=dtype,
**kwargs)._ndarray_values

if isinstance(values, (ABCSeries, cls)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use ABCIndexClass here? it’s acrually more clear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can never remember whether ABCIndexClass or ABCIndex includes subclasses, so I was happy when I realized I could just use cls here :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ABCIndexClass - it’s just more consistent in the code

@jorisvandenbossche
Copy link
Member

@TomAugspurger just to be clear: regarding the behaviour of xor with Index / Series, did you decide that the current behaviour is correct? (as you added a test for it)?

@TomAugspurger
Copy link
Contributor Author

TomAugspurger commented Nov 14, 2018 via email

@@ -504,6 +504,12 @@ def test_constructor_cast(self):
with pytest.raises(ValueError, match=msg):
Index(["a", "b", "c"], dtype=float)

def test_constructor_unwraps_index(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can make this more general by using the indices fixture here.

@@ -628,6 +628,12 @@ def test_constructor_coercion_signed_to_unsigned(self, uint_dtype):
with pytest.raises(OverflowError, match=msg):
Index([-1], dtype=uint_dtype)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than specificying testing a single index, see my above comment.

@TomAugspurger
Copy link
Contributor Author

parametrized over indices (skipping MultiIndex).

@TomAugspurger
Copy link
Contributor Author

All green.

@jorisvandenbossche
Copy link
Member

OK for the xor behaviour (We should maybe consider deprecating it, but that's a separate issue)

@jorisvandenbossche jorisvandenbossche merged commit fb68731 into pandas-dev:master Nov 15, 2018
@jorisvandenbossche
Copy link
Member

@TomAugspurger Thanks!

thoo added a commit to thoo/pandas that referenced this pull request Nov 15, 2018
* upstream/master:
  BUG: to_html misses truncation indicators (...) when index=False (pandas-dev#22786)
  API/DEPR: replace "raise_conflict" with "errors" for df.update (pandas-dev#23657)
  BUG: Append DataFrame to Series with dateutil timezone (pandas-dev#23685)
  CLN/CI: Catch that stderr-warning! (pandas-dev#23706)
  ENH: Allow for join between two multi-index dataframe instances (pandas-dev#20356)
  Ensure Index._data is an ndarray (pandas-dev#23628)
  DOC: flake8-per-pr for windows users (pandas-dev#23707)
  DOC: Handle exceptions when computing contributors. (pandas-dev#23714)
  DOC: Validate space before colon docstring parameters pandas-dev#23483 (pandas-dev#23506)
  BUG-22984 Fix truncation of DataFrame representations (pandas-dev#22987)
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants