DEPR: silently ignoring unrecognized timezones #51477

Merged on Feb 28, 2023 (3 commits)
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -838,6 +838,7 @@ Other API changes
 Deprecations
 ~~~~~~~~~~~~
 - Deprecated parsing datetime strings with system-local timezone to ``tzlocal``, pass a ``tz`` keyword or explicitly call ``tz_localize`` instead (:issue:`50791`)
+- Deprecated silently dropping unrecognized timezones when parsing strings to datetimes (:issue:`18702`)
 - Deprecated argument ``infer_datetime_format`` in :func:`to_datetime` and :func:`read_csv`, as a strict version of it is now the default (:issue:`48621`)
 - Deprecated behavior of :func:`to_datetime` with ``unit`` when parsing strings, in a future version these will be parsed as datetimes (matching unit-less behavior) instead of cast to floats. To retain the old behavior, cast strings to numeric types before calling :func:`to_datetime` (:issue:`50735`)
 - Deprecated :func:`pandas.io.sql.execute` (:issue:`50185`)
25 changes: 24 additions & 1 deletion pandas/_libs/tslibs/parsing.pyx
@@ -722,6 +722,18 @@ cdef datetime dateutil_parse(
                     f'Parsed string "{timestr}" gives an invalid tzoffset, '
                     "which must be between -timedelta(hours=24) and timedelta(hours=24)"
                 )
+        elif res.tzname is not None:
+            # e.g. "1994 Jan 15 05:16 FOO" where FOO is not recognized
+            # GH#18702
+            warnings.warn(
+                f'Parsed string "{timestr}" included an un-recognized timezone '
+                f'"{res.tzname}". Dropping unrecognized timezones is deprecated; '
+                "in a future version this will raise. Instead pass the string "
+                "without the timezone, then use .tz_localize to convert to a "
+                "recognized timezone.",
+                FutureWarning,
+                stacklevel=find_stack_level()
+            )
 
     out_bestunit[0] = attrname_to_npy_unit[reso]
     return ret
@@ -865,6 +877,8 @@ def guess_datetime_format(dt_str: str, bint dayfirst=False) -> str | None:
         datetime format string (for `strftime` or `strptime`),
         or None if it can't be guessed.
     """
+    cdef:
+        NPY_DATETIMEUNIT out_bestunit
     day_attribute_and_format = (("day",), "%d", 2)
 
     # attr name, format, padding (if any)
@@ -895,8 +909,17 @@ def guess_datetime_format(dt_str: str, bint dayfirst=False) -> str | None:
         datetime_attrs_to_format.remove(day_attribute_and_format)
         datetime_attrs_to_format.insert(0, day_attribute_and_format)
 
+    # same default used by dateutil
+    default = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
     try:
-        parsed_datetime = du_parse(dt_str, dayfirst=dayfirst)
+        parsed_datetime = dateutil_parse(
+            dt_str,
+            default=default,
+            dayfirst=dayfirst,
+            yearfirst=False,
+            ignoretz=False,
+            out_bestunit=&out_bestunit,
+        )
     except (ValueError, OverflowError, InvalidOperation):
         # In case the datetime can't be parsed, its format cannot be guessed
         return None
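Note on the new warning in `dateutil_parse`: because the deprecation is emitted as a `FutureWarning`, callers may want to find affected call sites before a future version raises. One possible way is to escalate `FutureWarning` to an error with the standard `warnings` module. The sketch below is an illustration only: `parse_stub` is a hypothetical stand-in that mimics the warning text and is not pandas code, and `emits_future_warning` is a made-up helper name.

```python
import warnings


def parse_stub(dt_str):
    # Hypothetical stand-in for a parser that drops an unknown timezone
    # label and warns about it. This is NOT the pandas implementation.
    naive_part, _, label = dt_str.rpartition(" ")
    warnings.warn(
        f'Parsed string "{dt_str}" included an un-recognized timezone '
        f'"{label}". Dropping unrecognized timezones is deprecated',
        FutureWarning,
        stacklevel=2,
    )
    return naive_part


def emits_future_warning(func, *args):
    """Return True if calling func(*args) emits a FutureWarning."""
    with warnings.catch_warnings():
        # Inside this block every FutureWarning is raised as an exception.
        warnings.simplefilter("error", FutureWarning)
        try:
            func(*args)
        except FutureWarning:
            return True
    return False


flagged = emits_future_warning(parse_stub, "2014 Jan 9 05:15 FAKE")
clean = emits_future_warning(str.upper, "2014 Jan 9 05:15")
print(flagged, clean)  # True False
```

The filter is scoped with `warnings.catch_warnings()` so that the stricter setting does not leak into the rest of the program.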
16 changes: 16 additions & 0 deletions pandas/tests/tools/test_to_datetime.py
@@ -3595,3 +3595,19 @@ def test_to_datetime_mixed_not_necessarily_iso8601_coerce(errors, expected):
     # https://github.com/pandas-dev/pandas/issues/50411
     result = to_datetime(["2020-01-01", "01-01-2000"], format="ISO8601", errors=errors)
     tm.assert_index_equal(result, expected)
+
+
+def test_ignoring_unknown_tz_deprecated():
+    # GH#18702, GH#51476
+    dtstr = "2014 Jan 9 05:15 FAKE"
+    msg = 'un-recognized timezone "FAKE". Dropping unrecognized timezones is deprecated'
+    with tm.assert_produces_warning(FutureWarning, match=msg):
+        res = Timestamp(dtstr)
+    assert res == Timestamp(dtstr[:-5])
+
+    with tm.assert_produces_warning(FutureWarning):
+        res = to_datetime(dtstr)
+    assert res == to_datetime(dtstr[:-5])
+    with tm.assert_produces_warning(FutureWarning):
+        res = to_datetime([dtstr])
+    tm.assert_index_equal(res, to_datetime([dtstr[:-5]]))
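
Note on the suggested workaround: the warning message asks callers to pass the string without the timezone and then attach a recognized timezone explicitly. The two steps can be sketched with only the standard library, as below. Everything named here is an assumption for illustration: `CUSTOM_TZ`, `parse_with_custom_tz`, and the UTC-5 offset assigned to the `FAKE` label are invented, and the format string assumes an English or C locale for the month abbreviation.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from in-house timezone labels to fixed UTC offsets.
# "FAKE" is the label used in the test; the -5h offset is invented.
CUSTOM_TZ = {"FAKE": timezone(timedelta(hours=-5), "FAKE")}


def parse_with_custom_tz(dt_str, fmt="%Y %b %d %H:%M"):
    """Split off a trailing timezone label, parse the rest, attach the tz."""
    naive_part, _, label = dt_str.rpartition(" ")
    if label not in CUSTOM_TZ:
        # Fail loudly instead of silently dropping the label.
        raise ValueError(f"unrecognized timezone {label!r} in {dt_str!r}")
    naive = datetime.strptime(naive_part, fmt)
    return naive.replace(tzinfo=CUSTOM_TZ[label])


result = parse_with_custom_tz("2014 Jan 9 05:15 FAKE")
print(result.isoformat())  # 2014-01-09T05:15:00-05:00
```

In pandas the same two steps would look like `Timestamp("2014 Jan 9 05:15").tz_localize("America/New_York")`, where the zone name is whatever the caller knows the label to mean.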