You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We ran into an issue where we had an invalid timestamp (year 21206) in our database and Python couldn't handle it. The error thrown was pretty generic (see logs below). It didn't tell me what table or row the bad timestamp was in. Eventually, I thought to peek into the WAL log using pg_logical_slot_peek_changes to see if I could find it, and I was able to spot it pretty quickly.
I have two suggestions here:
Wrap selected_value_to_singer_value in a try/except and output a useful error message detailing the table and log entry that caused the failure.
The error message should suggest using pg_logical_slot_peek_changes as a way to troubleshoot and find the bad data.
2020-02-26 21:38:34,995Z tap - CRITICAL year is out of range: 21202-02-26 00:00:00+00
2020-02-26 21:38:34,998Z tap - Traceback (most recent call last):
2020-02-26 21:38:34,998Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 655, in parse
2020-02-26 21:38:34,998Z tap - ret = self._build_naive(res, default)
2020-02-26 21:38:34,998Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 1241, in _build_naive
2020-02-26 21:38:34,998Z tap - naive = default.replace(**repl)
2020-02-26 21:38:34,998Z tap - ValueError: year is out of range
2020-02-26 21:38:34,998Z tap -
2020-02-26 21:38:34,999Z tap - The above exception was the direct cause of the following exception:
2020-02-26 21:38:34,999Z tap -
2020-02-26 21:38:34,999Z tap - Traceback (most recent call last):
2020-02-26 21:38:34,999Z tap - File "tap-env/bin/tap-postgres", line 11, in <module>
2020-02-26 21:38:34,999Z tap - load_entry_point('tap-postgres==0.0.67', 'console_scripts', 'tap-postgres')()
2020-02-26 21:38:34,999Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 708, in main
2020-02-26 21:38:34,999Z tap - raise exc
2020-02-26 21:38:34,999Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 705, in main
2020-02-26 21:38:34,999Z tap - main_impl()
2020-02-26 21:38:34,999Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 699, in main_impl
2020-02-26 21:38:34,999Z tap - do_sync(conn_config, args.properties, args.config.get('default_replication_method'), state)
2020-02-26 21:38:34,999Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 674, in do_sync
2020-02-26 21:38:34,999Z tap - state = sync_logical_streams(conn_config, list(streams), state, end_lsn)
2020-02-26 21:38:34,999Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 584, in sync_logical_streams
2020-02-26 21:38:34,999Z tap - state = logical_replication.sync_tables(conn_config, logical_streams, state, end_lsn)
2020-02-26 21:38:35,000Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 361, in sync_tables
2020-02-26 21:38:35,000Z tap - state = consume_message(logical_streams, state, msg, time_extracted, conn_info, end_lsn)
2020-02-26 21:38:35,000Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 269, in consume_message
2020-02-26 21:38:35,000Z tap - record_message = row_to_singer_message(target_stream, col_vals, stream_version, col_names, time_extracted, stream_md_map, conn_info)
2020-02-26 21:38:35,000Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 209, in row_to_singer_message
2020-02-26 21:38:35,000Z tap - cleaned_elem = selected_value_to_singer_value(elem, sql_datatype, conn_info)
2020-02-26 21:38:35,000Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 195, in selected_value_to_singer_value
2020-02-26 21:38:35,000Z tap - return selected_value_to_singer_value_impl(elem, sql_datatype, conn_info)
2020-02-26 21:38:35,000Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 156, in selected_value_to_singer_value_impl
2020-02-26 21:38:35,000Z tap - return parse(elem).isoformat()
2020-02-26 21:38:35,000Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 1374, in parse
2020-02-26 21:38:35,000Z tap - return DEFAULTPARSER.parse(timestr, **kwargs)
2020-02-26 21:38:35,000Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 657, in parse
2020-02-26 21:38:35,000Z tap - six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
2020-02-26 21:38:35,000Z tap - File "<string>", line 3, in raise_from
2020-02-26 21:38:35,000Z tap - dateutil.parser._parser.ParserError: year is out of range: 21202-02-26 00:00:00+00
2020-02-26 21:38:35,035Z target - INFO Requests complete, stopping loop
2020-02-26 21:38:35,085Z main - INFO Target exited normally with status 0
2020-02-26 21:38:35,087Z main - INFO Closed tunnel and deleted temporary keyfile
2020-02-26 21:38:35,087Z main - INFO Exit status is: Discovery succeeded. Tap failed with code 1 and error message: "year is out of range: 21202-02-26 00:00:00+00". Target succeeded.
The text was updated successfully, but these errors were encountered:
We ran into an issue where we had an invalid timestamp (year 21206) in our database and Python couldn't handle it. The error thrown was pretty generic (see logs below). It didn't tell me what table or row the bad timestamp was in. Eventually, I thought to peek into the WAL log using
pg_logical_slot_peek_changes
to see if I could find it, and I was able to spot it pretty quickly.I have two suggestions here:
pg_logical_slot_peek_changes
as a way to troubleshoot and find the bad data.The text was updated successfully, but these errors were encountered: