🐛 Destination Snowflake: duplicate rows on retries when using incremental staging #8832
Comments
@joshuataylor If I understand correctly, the problem is that all files in the stage are loaded, rather than only the files from this particular sync. Is that correct?
Correct. The files stay in the stage, so when a sync fails and retries, the retry adds new files to the stage next to the ones from the failed attempt. The rows from retry 1 and retry 2 are then both loaded, which produces duplicates.
@joshuataylor Could you advise how to cancel a query in Snowflake? I believe we need to know the query ID for that.
@joshuataylor Please ignore my previous comment. I have already found the approach I needed.
Environment
Current Behavior
On a newly created sync, if the sync fails and has to retry, the files that the failed attempt already put on the stage stay there, and the retry appends the same rows to the stage again. The result is duplicate rows.
Expected Behavior
The sync should not produce duplicate rows, even when it retries.
The stage should have a folder in it for each sync attempt, named with a UUID, as mentioned here: https://docs.snowflake.com/en/user-guide/data-load-local-file-system-stage.html
This way the UUID is used just for that attempt, and each retry should use a new UUID. On failure, the files under that attempt's UUID should be deleted.
Attempt 1:
Attempt 2:
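The per-attempt folder idea can be sketched roughly as follows. This is only an illustration, not the connector's actual code and not the original attempt listings: `FakeStage` and `run_sync` are hypothetical names, and the stage is modeled in memory, with comments noting which Snowflake commands (`PUT`, `COPY INTO`, `REMOVE`) each method is meant to stand in for. It assumes the first attempt fails part-way through and the second one succeeds.

```python
import uuid


class FakeStage:
    """Hypothetical in-memory stand-in for a Snowflake stage: file path -> rows."""

    def __init__(self):
        self.files = {}

    def put(self, path, rows):
        # Stands in for: PUT file://... @stage/<path>
        self.files[path] = list(rows)

    def copy_into(self, prefix):
        # Stands in for: COPY INTO <table> FROM @stage/<prefix>
        loaded = []
        for path in sorted(self.files):
            if path.startswith(prefix):
                loaded.extend(self.files[path])
        return loaded

    def remove(self, prefix):
        # Stands in for: REMOVE @stage/<prefix>
        for path in [p for p in self.files if p.startswith(prefix)]:
            del self.files[path]


def run_sync(stage, batches, fail_first_attempt_after, per_attempt_folder):
    """Run up to two attempts; the first one fails before uploading batch N."""
    for attempt in (1, 2):
        # With per_attempt_folder, every attempt gets a fresh UUID folder.
        folder = f"{uuid.uuid4()}/" if per_attempt_folder else ""
        try:
            for i, rows in enumerate(batches):
                if attempt == 1 and i == fail_first_attempt_after:
                    raise RuntimeError("simulated sync failure")
                stage.put(f"{folder}{uuid.uuid4()}.csv", rows)
            # Load only what lives under this attempt's folder.
            return stage.copy_into(folder)
        except RuntimeError:
            if per_attempt_folder:
                stage.remove(folder)  # delete only this attempt's files
    return []


batches = [[1, 2], [3, 4]]
# Shared stage root: the failed attempt's file is loaded again on retry.
print(sorted(run_sync(FakeStage(), batches, 1, per_attempt_folder=False)))
# One UUID folder per attempt: only the successful attempt's files are loaded.
print(sorted(run_sync(FakeStage(), batches, 1, per_attempt_folder=True)))
```

With the shared stage root, the rows from the first batch show up twice, which matches the behavior reported here. With a UUID folder per attempt, the load is scoped to the successful attempt, and the cleanup on failure only keeps the stage tidy.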
Steps to Reproduce
Are you willing to submit a PR?
Maybe?