Destinations should support timestamp with millisecond precision #8904

Open
6 of 28 tasks
tuliren opened this issue Dec 20, 2021 · 4 comments
Labels
area/connectors Connector related issues area/databases frozen Not being actively worked on team/destinations Destinations team's backlog

Comments

@tuliren
Contributor

tuliren commented Dec 20, 2021

Summary

Currently, the Postgres source returns timestamps with second precision. This causes problems for timestamp columns with millisecond precision when such a column is used as the cursor in an incremental sync. The data persisted on the destination side has second precision, while the original data in the database has millisecond precision. Consequently, the timestamp in the original data is always newer than the one synced to the destination because of the extra millisecond values, so a sync is triggered even when there is no new data.

Slack thread.
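To make the failure mode concrete, here is a minimal, self-contained sketch (hypothetical values, not Airbyte code) of why a second-precision cursor re-syncs the same row on every run:

```python
from datetime import datetime

# The source emits cursor values truncated to second precision,
# while the database column keeps milliseconds.
def truncate_to_seconds(ts: datetime) -> datetime:
    return ts.replace(microsecond=0)

row_updated_at = datetime(2021, 12, 20, 10, 0, 0, 123000)  # 10:00:00.123

# Cursor persisted after the first sync has lost the milliseconds.
persisted_cursor = truncate_to_seconds(row_updated_at)     # 10:00:00.000

# Next incremental sync asks: "fetch rows newer than the cursor".
is_resynced = row_updated_at > persisted_cursor
print(is_resynced)  # True -> the same row is picked up again every sync
```

Because the stored cursor is strictly older than the row's real timestamp, the incremental filter never converges and the row is re-emitted indefinitely.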

TODOs

  • Here is the implementation plan. Once ready, convert each bullet point into its own issue.
  • Update at least one of the existing DAT timestamp test cases to have millisecond precision.
  • Update each destination to support millisecond-precision timestamps. If a destination is incompatible with such timestamps, make sure its DAT does not fail (probably by overriding the assertion to still check for second-precision timestamps). It may be necessary to update the normalization and dbt code. Some of these may be combined into one project (e.g. some JDBC destinations).
    • Azure Blob Storage
    • BigQuery
    • ClickHouse
    • Cassandra
    • Databricks
    • Elasticsearch
    • GCS
    • Google Firestore
    • Google PubSub
    • Kafka
    • Keen (Charfigy)
    • Local CSV
    • Local JSON
    • MariaDB ColumnStore
    • Mongo DB
    • MQTT
    • Postgres
    • Pulsar
    • Redshift
    • Rockset
    • S3
    • SFTP-JSON
    • Snowflake
  • Make sure all destinations can pass the new DAT test case.
  • Update the following source databases to support millisecond-precision timestamps
  • Linked bugs
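As a rough illustration of the DAT change described above, a millisecond-aware timestamp assertion might look like the following sketch. The function and parameter names here are illustrative, not Airbyte's actual test API; the `second_precision_only` flag stands in for the proposed override for destinations that cannot store milliseconds:

```python
from datetime import datetime

def assert_timestamps_equal(expected: str, actual: str,
                            second_precision_only: bool = False) -> None:
    """Compare two ISO-style timestamps at millisecond precision,
    optionally falling back to second precision for destinations
    that cannot store sub-second values (hypothetical helper)."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f"
    exp = datetime.strptime(expected, fmt)
    act = datetime.strptime(actual, fmt)
    if second_precision_only:
        exp = exp.replace(microsecond=0)
        act = act.replace(microsecond=0)
    assert exp == act, f"timestamp mismatch: {expected} != {actual}"

# Passes only if the destination preserved the milliseconds.
assert_timestamps_equal("2021-12-20T10:00:00.123000",
                        "2021-12-20T10:00:00.123000")
```

A destination that truncates to seconds would override the check with `second_precision_only=True` so its DAT still passes while the incompatibility is documented.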

┆Issue is synchronized with this Asana task by Unito

@tuliren tuliren self-assigned this Dec 20, 2021
@tuliren tuliren removed the blocked label Dec 20, 2021
@tuliren tuliren added this to the ConnCore Dec 22, 2021 milestone Dec 20, 2021
@DoNotPanicUA DoNotPanicUA self-assigned this Jan 6, 2022
@tuliren tuliren removed their assignment Jan 12, 2022
@tuliren tuliren removed this from the ConnCore Jan 5 milestone Jan 12, 2022
@alexandr-shegeda alexandr-shegeda moved this to Ready for implementation in GL Roadmap Jan 12, 2022
@alexandr-shegeda alexandr-shegeda changed the title Postgres source should return timestamp with millisecond precision [EPIC] Postgres source should return timestamp with millisecond precision Jan 12, 2022
@alexandr-shegeda alexandr-shegeda self-assigned this Jan 12, 2022
@alexandr-shegeda alexandr-shegeda moved this from Ready for implementation to Scoping complete in GL Roadmap Jan 27, 2022
@ameyabapat-bsft

Do we need to add snowflake-source to the "Update the following source databases to support millisecond precision timestamp" category? I have found a similar issue: #9915

@andriikorotkov andriikorotkov moved this from Scoping complete to Implementation in progress in GL Roadmap Feb 1, 2022
@ameyabapat-bsft

@alafanechere @tuliren any updates on source snowflake (#9915)? This issue is magnified in our use case, where a large data dump (10-100k rows) is added at the source in a single operation, which makes the timestamps of many rows identical, so all of them are resynced in the next sync operation.

@grishick
Contributor

grishick commented Aug 4, 2022

Sources have been fixed. Converting this to a destination-specific issue. Next step: create issues for each destination that is capable of supporting millisecond precision.

@grishick grishick removed the Epic label Aug 4, 2022
@grishick grishick changed the title [EPIC] Postgres source should return timestamp with millisecond precision Destinations should support timestamp with millisecond precision Aug 4, 2022
@tuliren
Contributor Author

tuliren commented Aug 9, 2022

@alexandr-shegeda, is GL working on the destination changes as well?

@grishick grishick added the team/destinations Destinations team's backlog label Sep 27, 2022
@bleonard bleonard added the frozen Not being actively worked on label Mar 22, 2024

10 participants