DEPR: deprecate errors='ignore' in to_datetime and make output dtype predictable #54467

Closed
1 of 3 tasks
MarcoGorelli opened this issue Aug 9, 2023 · 12 comments · Fixed by #55734
Labels
Datetime, Deprecate, Enhancement

Comments

@MarcoGorelli
Member

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Currently, if you call to_datetime(inputs), you can't know in advance what the dtype of the output will be. It could be an Index or a DatetimeIndex.

Deprecating the parsing of mixed offsets goes part of the way to addressing this.

Feature Description

Can we go all the way and deprecate errors='ignore'? Then, if the computation succeeds, you always get a DatetimeIndex (potentially with some NaTs if errors='coerce').
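As a minimal sketch of the predictable case described above (the input strings are made up for illustration), errors='coerce' on a list keeps the result a DatetimeIndex and marks the unparseable entry as NaT:

```python
import pandas as pd

# Illustrative input: one valid date string and one string that cannot be parsed.
parsed = pd.to_datetime(["2023-08-09", "not a date"], errors="coerce")

# The container type stays DatetimeIndex; only the bad entry becomes NaT.
print(type(parsed).__name__)
print(parsed.isna().tolist())
```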

Alternative Solutions

none that I can think of

Additional Context

There's talk about query optimisation in pandas, and greater predictability should help with that.

@MarcoGorelli added the Enhancement, Deprecate, and Datetime labels and removed the Needs Triage label on Aug 9, 2023
@jbrockmendel
Member

+1

@mroeschke
Member

Likewise for to_timedelta and to_numeric

@Cyddharth-Gupta

@MarcoGorelli I'd like to take up this issue, but I was hoping to get a bit more info first: if we deprecate errors='ignore', should it emit a warning saying that it is deprecated?

@MarcoGorelli
Member Author

yup, that's right!

@Cyddharth-Gupta

Cyddharth-Gupta commented Aug 10, 2023

Okay, then I will take up this issue.

@Cyddharth-Gupta

take

@SteffenMeinecke

@MarcoGorelli @Cyddharth-Gupta For my applications, errors="ignore" is what I need. What is the recommended code for pandas 3.0 to get previous behavior?
Consider the following example:

df1 = pd.DataFrame([[1, 1, 2, 4, 6], [7, 7, 5., 4.1, "h"], [None, 3, 5.2, 4, "ghjh"]])
print(df1.dtypes)
# code instead of `df2 = df1.apply(pd.to_numeric, errors='ignore')`
print(df2.dtypes)

and the expected output:

0 object
1 object
2 object
3 object
4 object
dtype: object

0 float64
1 int64
2 float64
3 float64
4 object
dtype: object

@MarcoGorelli
Member Author

That's a different function; this issue is just about to_datetime.

@SteffenMeinecke

Thanks @MarcoGorelli for replying. Since @mroeschke mentioned that the same applies to to_numeric(), I thought it belonged in this issue. Can you point me to the right one?

@dbrtly

dbrtly commented Jun 4, 2024

What is the recommended refactor for a line of code like this, which triggers a warning under pytest?

expected[col] = pd.to_datetime(expected[col], errors="ignore").dt.time

@MarcoGorelli
Member Author

I'd suggest wrapping it in a try-except.
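One way the try-except suggestion could look is sketched below. The helper name to_datetime_or_original and the sample data are hypothetical, not part of pandas, and the set of caught exceptions is an assumption about what a failed parse typically raises.

```python
import pandas as pd

def to_datetime_or_original(values):
    """Hypothetical helper: parse if possible, otherwise return the input unchanged."""
    try:
        return pd.to_datetime(values)
    except (ValueError, TypeError):
        # Assumption: parse failures surface as ValueError (or a subclass) or TypeError.
        return values

# Parseable input comes back with a datetime dtype, so .dt.time works on it.
good = to_datetime_or_original(pd.Series(["2024-06-04 10:30:00", "2024-06-05 11:45:00"]))
print(good.dt.time)

# Unparseable input comes back as the very same object that was passed in.
bad_input = pd.Series(["2009/07/31", "asd"])
bad = to_datetime_or_original(bad_input)
print(bad is bad_input)
```

Note that a chained `.dt.time` only works when parsing succeeded; if the helper falls back to the original values, the caller has to check the dtype before using the `.dt` accessor.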

@MohitBurkule

@MarcoGorelli @Cyddharth-Gupta For my applications, errors="ignore" is what I need. What is the recommended code for pandas 3.0 to get previous behavior? Consider the following example:

df1 = pd.DataFrame([[1, 1, 2, 4, 6], [7, 7, 5., 4.1, "h"], [None, 3, 5.2, 4, "ghjh"]])
print(df1.dtypes)
# code instead of `df2 = df1.apply(pd.to_numeric, errors='ignore')`
print(df2.dtypes)

and the expected output:

0 object
1 object
2 object
3 object
4 object
dtype: object

0 float64
1 int64
2 float64
3 float64
4 object
dtype: object

The below might work:

df2 = df1.apply(pd.to_numeric, errors='coerce')
df2 = df2.fillna(df1)
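An alternative, following the try-except advice given earlier in the thread for to_datetime, is to attempt the conversion column by column and keep any column that fails untouched. This is only a sketch: to_numeric_or_original is a hypothetical helper name, and catching ValueError and TypeError is an assumption about how to_numeric reports unparseable values.

```python
import pandas as pd

def to_numeric_or_original(column):
    """Hypothetical helper: convert a column to numeric, or return it unchanged on failure."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        # Assumption: unparseable values raise ValueError or TypeError.
        return column

df1 = pd.DataFrame([[1, 1, 2, 4, 6], [7, 7, 5.0, 4.1, "h"], [None, 3, 5.2, 4, "ghjh"]])

# Each column is converted independently; column 4 contains strings and is left as-is.
df2 = df1.apply(to_numeric_or_original)
print(df2.dtypes)
```

Unlike the coerce-then-fillna approach, a column that fails to convert keeps its original values exactly, rather than having its parseable entries cast to float first.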
