fix(rust): on write, have a schema evolution maintain metadata table id #3275

liamphmurphy
Contributor

@liamphmurphy liamphmurphy commented Feb 27, 2025

Description

When a write triggers schema evolution and the table state already has a metadata ID, reuse that ID in the new metadata transaction entry instead of generating a new one.

This PR adds a unit test and a new integration test that runs a PySpark stream between evolution runs. I confirmed that the integration test fails with the error I've seen in production when the new logic is removed, and passes with the new logic in place.
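
For readers skimming the description, the rule can be sketched as follows. This is a minimal illustration, not the code from this PR: `TableMetadata`, `evolved_metadata`, and the `fresh_id` callback are hypothetical names, and the schema is reduced to a plain string.

```rust
/// Hypothetical, simplified stand-in for a table's metadata entry.
/// Not the delta-rs type; the schema is just a string here.
#[derive(Debug, Clone, PartialEq)]
struct TableMetadata {
    id: String,
    schema: String,
}

/// Build the metadata entry to commit after a schema change.
///
/// If the current table state already carries metadata, its ID is reused.
/// `fresh_id` (an assumed ID generator) is only called when no prior
/// metadata exists, e.g. when the table is being created.
fn evolved_metadata(
    existing: Option<&TableMetadata>,
    new_schema: &str,
    fresh_id: impl FnOnce() -> String,
) -> TableMetadata {
    let id = match existing {
        // Keep the table ID stable across schema evolution.
        Some(current) => current.id.clone(),
        // No previous metadata: a new ID is acceptable.
        None => fresh_id(),
    };
    TableMetadata {
        id,
        schema: new_schema.to_string(),
    }
}
```

Calling `evolved_metadata(Some(&current), "a,b", make_id)` returns an entry with `current.id` and the new schema, so a reader that tracks the table by ID keeps seeing the same table after the evolution.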

Related Issue(s)

closes #3274

Documentation

@liamphmurphy liamphmurphy changed the title on a schema evolution, if a previous table metadata id, persist in ne… fix(rust): on a schema evolution, if a previous table metadata id, persist in ne… Feb 27, 2025
@github-actions github-actions bot added the binding/rust Issues for the Rust crate label Feb 27, 2025

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@liamphmurphy liamphmurphy changed the title fix(rust): on a schema evolution, if a previous table metadata id, persist in ne… fix(rust): on a schema evolution, maintain metadata table id Feb 27, 2025
rtyler
rtyler previously approved these changes Feb 27, 2025
@rtyler rtyler changed the title fix(rust): on a schema evolution, maintain metadata table id fix(rust): on a schema evolutionmaintain metadata table id Feb 27, 2025
@rtyler rtyler changed the title fix(rust): on a schema evolutionmaintain metadata table id fix(rust): on a schema evolution maintain metadata table id Feb 27, 2025
@rtyler rtyler force-pushed the fix/maintain_table_id_through_schema_evolution branch from 20292a8 to e97466c Compare February 27, 2025 19:29
@rtyler rtyler enabled auto-merge February 27, 2025 19:30
@rtyler rtyler changed the title fix(rust): on a schema evolution maintain metadata table id fix(rust): on a schema evolution maintain metadata table id Feb 27, 2025
auto-merge was automatically disabled February 27, 2025 19:59

Head branch was pushed to by a user without write access

@liamphmurphy liamphmurphy requested a review from rtyler February 27, 2025 20:04
@rtyler rtyler self-assigned this Feb 27, 2025
@github-actions github-actions bot added the binding/python Issues for the Python package label Feb 27, 2025
@liamphmurphy liamphmurphy force-pushed the fix/maintain_table_id_through_schema_evolution branch from 758b6cf to 0b3b218 Compare February 27, 2025 22:49
@liamphmurphy liamphmurphy changed the title fix(rust): on a schema evolution maintain metadata table id fix(rust): on write, have a schema evolution maintain metadata table id Feb 27, 2025

codecov bot commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 83.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 72.09%. Comparing base (94a2009) to head (1a3f32e).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/core/src/operations/write/mod.rs | 75.00% | 0 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3275      +/-   ##
==========================================
- Coverage   72.11%   72.09%   -0.02%     
==========================================
  Files         143      143              
  Lines       45530    45537       +7     
  Branches    45530    45537       +7     
==========================================
- Hits        32833    32830       -3     
- Misses      10618    10624       +6     
- Partials     2079     2083       +4     


@ion-elgreco
Collaborator

@liamphmurphy can you squash the commits and fix the CI?

@rtyler rtyler force-pushed the fix/maintain_table_id_through_schema_evolution branch from e715999 to 3db01ed Compare February 28, 2025 14:51
rtyler
rtyler previously approved these changes Feb 28, 2025
@rtyler rtyler enabled auto-merge February 28, 2025 14:52
…in new metadata transaction

Signed-off-by: Liam Murphy <liam@phmurphy.com>
@rtyler rtyler force-pushed the fix/maintain_table_id_through_schema_evolution branch from 3db01ed to 1a3f32e Compare February 28, 2025 14:58
@rtyler rtyler added this pull request to the merge queue Feb 28, 2025
Merged via the queue into delta-io:main with commit f495ee6 Feb 28, 2025
28 checks passed
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate
Development

Successfully merging this pull request may close these issues.

Schema evolution causing table ID to be regenerated, breaks Spark streaming jobs
3 participants