Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better logging when row_to_singer_message fails #79

Open
idris opened this issue Feb 26, 2020 · 0 comments
Open

Better logging when row_to_singer_message fails #79

idris opened this issue Feb 26, 2020 · 0 comments

Comments

@idris
Copy link

idris commented Feb 26, 2020

We ran into an issue where we had an invalid timestamp (year 21206) in our database and Python couldn't handle it. The error thrown was pretty generic (see logs below). It didn't tell me what table or row the bad timestamp was in. Eventually, I thought to peek into the WAL log using pg_logical_slot_peek_changes to see if I could find it, and I was able to spot it pretty quickly.

I have two suggestions here:

  1. Wrap selected_value_to_singer_value in a try/except and output a useful error message detailing the table and log entry that caused the failure.
  2. The error message should suggest using pg_logical_slot_peek_changes as a way to troubleshoot and find the bad data.
2020-02-26 21:38:34,995Z    tap - CRITICAL year is out of range: 21202-02-26 00:00:00+00
2020-02-26 21:38:34,998Z    tap - Traceback (most recent call last):
2020-02-26 21:38:34,998Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 655, in parse
2020-02-26 21:38:34,998Z    tap -     ret = self._build_naive(res, default)
2020-02-26 21:38:34,998Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 1241, in _build_naive
2020-02-26 21:38:34,998Z    tap -     naive = default.replace(**repl)
2020-02-26 21:38:34,998Z    tap - ValueError: year is out of range
2020-02-26 21:38:34,998Z    tap - 
2020-02-26 21:38:34,999Z    tap - The above exception was the direct cause of the following exception:
2020-02-26 21:38:34,999Z    tap - 
2020-02-26 21:38:34,999Z    tap - Traceback (most recent call last):
2020-02-26 21:38:34,999Z    tap -   File "tap-env/bin/tap-postgres", line 11, in <module>
2020-02-26 21:38:34,999Z    tap -     load_entry_point('tap-postgres==0.0.67', 'console_scripts', 'tap-postgres')()
2020-02-26 21:38:34,999Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 708, in main
2020-02-26 21:38:34,999Z    tap -     raise exc
2020-02-26 21:38:34,999Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 705, in main
2020-02-26 21:38:34,999Z    tap -     main_impl()
2020-02-26 21:38:34,999Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 699, in main_impl
2020-02-26 21:38:34,999Z    tap -     do_sync(conn_config, args.properties, args.config.get('default_replication_method'), state)
2020-02-26 21:38:34,999Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 674, in do_sync
2020-02-26 21:38:34,999Z    tap -     state = sync_logical_streams(conn_config, list(streams), state, end_lsn)
2020-02-26 21:38:34,999Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/__init__.py", line 584, in sync_logical_streams
2020-02-26 21:38:34,999Z    tap -     state = logical_replication.sync_tables(conn_config, logical_streams, state, end_lsn)
2020-02-26 21:38:35,000Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 361, in sync_tables
2020-02-26 21:38:35,000Z    tap -     state = consume_message(logical_streams, state, msg, time_extracted, conn_info, end_lsn)
2020-02-26 21:38:35,000Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 269, in consume_message
2020-02-26 21:38:35,000Z    tap -     record_message = row_to_singer_message(target_stream, col_vals, stream_version, col_names, time_extracted, stream_md_map, conn_info)
2020-02-26 21:38:35,000Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 209, in row_to_singer_message
2020-02-26 21:38:35,000Z    tap -     cleaned_elem = selected_value_to_singer_value(elem, sql_datatype, conn_info)
2020-02-26 21:38:35,000Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 195, in selected_value_to_singer_value
2020-02-26 21:38:35,000Z    tap -     return selected_value_to_singer_value_impl(elem, sql_datatype, conn_info)
2020-02-26 21:38:35,000Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_postgres/sync_strategies/logical_replication.py", line 156, in selected_value_to_singer_value_impl
2020-02-26 21:38:35,000Z    tap -     return parse(elem).isoformat()
2020-02-26 21:38:35,000Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 1374, in parse
2020-02-26 21:38:35,000Z    tap -     return DEFAULTPARSER.parse(timestr, **kwargs)
2020-02-26 21:38:35,000Z    tap -   File "/code/orchestrator/tap-env/lib/python3.5/site-packages/dateutil/parser/_parser.py", line 657, in parse
2020-02-26 21:38:35,000Z    tap -     six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
2020-02-26 21:38:35,000Z    tap -   File "<string>", line 3, in raise_from
2020-02-26 21:38:35,000Z    tap - dateutil.parser._parser.ParserError: year is out of range: 21202-02-26 00:00:00+00
2020-02-26 21:38:35,035Z target - INFO Requests complete, stopping loop
2020-02-26 21:38:35,085Z   main - INFO Target exited normally with status 0
2020-02-26 21:38:35,087Z   main - INFO Closed tunnel and deleted temporary keyfile
2020-02-26 21:38:35,087Z   main - INFO Exit status is: Discovery succeeded. Tap failed with code 1 and error message: "year is out of range: 21202-02-26 00:00:00+00". Target succeeded.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant