fix for adding unexpected Empty Records in Nested Arrays in BigQueryIO #34102

Open
wants to merge 7 commits into
base: master

stankiewicz
Contributor

fixes #33842

Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @robertwb for label java.
R: @damondouglas for label io.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@stankiewicz changed the title from "fix for adding unexpected Empty Records in Nested Arrays" to "fix for adding unexpected Empty Records in Nested Arrays in BigQueryIO" on Feb 28, 2025
@stankiewicz
Contributor Author

@ahmedabu98 can you take a look at this? thanks!

@stankiewicz
Contributor Author

do not merge yet, I think I found another bug.

Contributor

@ahmedabu98 ahmedabu98 left a comment

This LGTM! Thanks for taking this @stankiewicz

Lmk when it's ready to merge

@stankiewicz
Contributor Author

stankiewicz commented Mar 3, 2025

I see an additional problem here, in StorageApiWriteUnshardedRecords.

If we have a row with known fields (name:string, surname:string) and one unknown field (foo:string), this line will concatenate the known and unknown fields without causing schema issues.

But if the unknown field is part of a repeated structure, e.g. (name:string, phone:(repeated struct type:string, num:string)), and the unknown field sits inside the repeated struct (phone: [ null, {favourite:yes} ]), the concat doesn't work well for some reason.
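One possible explanation, offered only as a sketch: if the concat is a byte-level concatenation of two serialized protobuf messages, protobuf treats that as a merge in which singular fields are overwritten, nested messages are merged, and repeated fields are appended rather than merged element by element. The plain-Java model below imitates those rules on maps and lists. The class `ConcatMergeSketch` and its `merge` helper are hypothetical and are not Beam code, the field values are made up, and the `null` placeholder from the example is modelled as an empty struct, which is also an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of protobuf merge-on-concatenation semantics; not Beam code.
class ConcatMergeSketch {

  /** Merges "unknown" fields into "known" fields the way concatenated protobuf payloads merge. */
  @SuppressWarnings("unchecked")
  public static Map<String, Object> merge(Map<String, Object> known, Map<String, Object> unknown) {
    Map<String, Object> out = new LinkedHashMap<>(known);
    for (Map.Entry<String, Object> e : unknown.entrySet()) {
      Object existing = out.get(e.getKey());
      Object incoming = e.getValue();
      if (existing instanceof List && incoming instanceof List) {
        // Repeated field: incoming elements are appended, never matched up by index.
        List<Object> joined = new ArrayList<>((List<Object>) existing);
        joined.addAll((List<Object>) incoming);
        out.put(e.getKey(), joined);
      } else if (existing instanceof Map && incoming instanceof Map) {
        // Singular nested message: merged field by field.
        out.put(e.getKey(),
            merge((Map<String, Object>) existing, (Map<String, Object>) incoming));
      } else {
        // Singular scalar, or a field not present yet: the incoming value wins.
        out.put(e.getKey(), incoming);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Top-level unknown field: merging simply adds "foo" next to the known fields.
    Map<String, Object> flatKnown = Map.of("name", "Ann", "surname", "Lee");
    Map<String, Object> flatUnknown = Map.of("foo", "bar");
    System.out.println(merge(flatKnown, flatUnknown));

    // Unknown field inside a repeated struct: the unknown-side list is appended,
    // so "favourite" ends up in a new trailing element instead of the second phone.
    Map<String, Object> nestedKnown = Map.of(
        "name", "Ann",
        "phone", List.of(
            Map.of("type", "home", "num", "1"),
            Map.of("type", "work", "num", "2")));
    Map<String, Object> nestedUnknown = Map.of(
        "phone", List.of(Map.of(), Map.of("favourite", "yes")));
    System.out.println(merge(nestedKnown, nestedUnknown));
  }
}
```

Under this model the flat case yields one row with three fields, while the nested case yields four `phone` entries, one of them empty, rather than two entries with `favourite` attached to the second. Whether this is what actually happens in StorageApiWriteUnshardedRecords would need to be checked against the code.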

The current bug is resolved, though: this change no longer adds empty records, so we can merge it.

I opened #34145 for concatenating unknown nested fields, which may require bigger changes.

2 participants