Schema evolution causing table ID to be regenerated, breaks Spark streaming jobs #3274

Closed
liamphmurphy opened this issue Feb 27, 2025 · 0 comments · Fixed by #3275
Labels
bug Something isn't working

Comments

@liamphmurphy
Contributor

Environment

Delta-rs version: 0.25.2

Binding: Python

Environment: AWS, Local


Bug

What happened:

When a schema evolution write is triggered, we generate a whole new metadata line to represent the new schema, which makes sense. But we also regenerate the table's ID, which breaks Spark streaming jobs that use the metadata ID to identify the table being read. When the ID changes, the Spark streaming read treats the table as an entirely new one.

The new metadata struct and its ID are generated here:

let metadata = Metadata::try_new(schema, part_cols, HashMap::new())?;

What you expected to happen:

A new metadata struct is generated, but it reuses the metadata ID from the table's existing state.
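To make the expected behavior concrete, here is a minimal, std-only sketch. The `Metadata` struct, `fresh_id`, `with_table_id`, and `evolve_schema` below are hypothetical stand-ins written for illustration, not the actual delta-rs types or API; only the shape of the `Metadata::try_new(schema, part_cols, HashMap::new())` call is taken from the line quoted above. The idea is simply to build the new metadata as before, then overwrite its freshly minted ID with the one from the existing table state.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical stand-in for the table metadata; NOT the delta-rs type.
#[derive(Debug, Clone, PartialEq)]
pub struct Metadata {
    pub id: String,
    pub schema: Vec<String>, // column names only, to keep the sketch short
    pub partition_columns: Vec<String>,
    pub configuration: HashMap<String, String>,
}

static NEXT_ID: AtomicU64 = AtomicU64::new(1);

// Stand-in for random ID generation; a counter keeps the example std-only.
fn fresh_id() -> String {
    format!("table-{}", NEXT_ID.fetch_add(1, Ordering::Relaxed))
}

impl Metadata {
    // Same argument shape as the call quoted in the issue: always mints a new ID.
    pub fn try_new(
        schema: Vec<String>,
        partition_columns: Vec<String>,
        configuration: HashMap<String, String>,
    ) -> Result<Self, String> {
        for col in &partition_columns {
            if !schema.contains(col) {
                return Err(format!("partition column `{col}` is not in the schema"));
            }
        }
        Ok(Self { id: fresh_id(), schema, partition_columns, configuration })
    }

    // Hypothetical builder that replaces the generated ID with a given one.
    pub fn with_table_id(mut self, id: String) -> Self {
        self.id = id;
        self
    }
}

// Schema evolution: new schema, but the ID is carried over from the current state.
pub fn evolve_schema(current: &Metadata, new_schema: Vec<String>) -> Result<Metadata, String> {
    let evolved = Metadata::try_new(
        new_schema,
        current.partition_columns.clone(),
        HashMap::new(), // mirrors the call quoted in the issue
    )?;
    Ok(evolved.with_table_id(current.id.clone()))
}
```

With this shape, calling `evolve_schema` on an existing table's metadata yields a struct with the new schema and the original `id`, while a direct `Metadata::try_new` call still produces a different ID, which is the behavior the issue reports as the bug.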

How to reproduce it:

Perform any schema evolution write on a table and compare the original metadata.id to the new one.

More details:

I'm tinkering locally and think I have a fix, but I'm not a Rust expert, so if anyone has strong opinions on how to implement it, feel free to override me.

@liamphmurphy added the bug (Something isn't working) label Feb 27, 2025
@liamphmurphy changed the title from "Schema evolution causing table ID to be rewritten" to "Schema evolution causing table ID to be regenerated, breaks Spark streaming jobs" Feb 27, 2025