Refactor and clean up json avro schema converter #9363
obtainPaths(arrayPath, arrayNode.get(i), jsonNodePathMap);
}
}
}
This method proactively calculates all paths in order to generate the namespaces. However, this is not necessary: the namespaces can be computed while traversing the JSON schema.
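A minimal sketch of the idea in this comment, not the converter's actual code: the namespace of each nested object is derived from the names of its enclosing objects at the moment it is visited, so no separate path-collection pass is needed. The class `NamespaceTraversalSketch`, its methods, and the nested-map representation of a schema are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; a nested Map stands in for a nested JSON object schema.
class NamespaceTraversalSketch {

  // Returns one "name -> namespace" entry per nested object, in visiting order.
  static List<String> collect(Map<String, Object> root) {
    List<String> result = new ArrayList<>();
    walk(root, new ArrayList<>(), result);
    return result;
  }

  @SuppressWarnings("unchecked")
  private static void walk(Map<String, Object> node, List<String> parents, List<String> result) {
    for (Map.Entry<String, Object> entry : node.entrySet()) {
      if (entry.getValue() instanceof Map) {
        // The namespace is the dot-joined chain of enclosing object names,
        // which is already known here without any precomputed path map.
        result.add(entry.getKey() + " -> " + String.join(".", parents));
        parents.add(entry.getKey());
        walk((Map<String, Object>) entry.getValue(), parents, result);
        parents.remove(parents.size() - 1); // leave this object before visiting siblings
      }
    }
  }

  // Two nested objects share the name "id" but live under different parents.
  static Map<String, Object> sample() {
    Map<String, Object> userId = new LinkedHashMap<>();
    userId.put("value", "integer");
    Map<String, Object> user = new LinkedHashMap<>();
    user.put("id", userId);

    Map<String, Object> orderId = new LinkedHashMap<>();
    orderId.put("value", "string");
    Map<String, Object> order = new LinkedHashMap<>();
    order.put("id", orderId);

    Map<String, Object> root = new LinkedHashMap<>();
    root.put("user", user);
    root.put("order", order);
    return root;
  }

  public static void main(String[] args) {
    collect(sample()).forEach(System.out::println);
  }
}
```

In this sketch the two `id` objects end up in different namespaces (`user` and `order`), which is the property that same-named objects need in order to avoid colliding.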
@@ -1216,5 +1168,226 @@
"identifier": ["151", "152", "true", "{\"id\":153}"],
"_airbyte_additional_properties": null
}
},
The above case tests objects with the same name. However, it had many irrelevant fields, which have been removed to make the test case easier to read.
/test connector=connectors/destination-s3
/test connector=connectors/destination-gcs
@@ -161,6 +161,149 @@ This is not supported in Avro schema. As a compromise, the converter creates a u
}
```

If the Json array has multiple object items, these objects will be recursively merged into one Avro record. For example, the following Json array expects two different objects, each with a different `id` field.
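As a rough illustration of the recursive merge described in this doc excerpt, here is a sketch that uses plain maps instead of Avro schemas. It is not the converter's implementation. The class `RecordMergeSketch` is hypothetical, and collecting conflicting leaf types into a set is an assumption about how same-named fields are combined.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; maps stand in for object items, strings for leaf types.
class RecordMergeSketch {

  // Merges two object descriptions field by field without mutating either input.
  @SuppressWarnings("unchecked")
  static Map<String, Object> merge(Map<String, Object> left, Map<String, Object> right) {
    Map<String, Object> merged = new LinkedHashMap<>(left);
    for (Map.Entry<String, Object> entry : right.entrySet()) {
      Object existing = merged.get(entry.getKey());
      Object incoming = entry.getValue();
      if (existing == null) {
        // Field only appears in the second object.
        merged.put(entry.getKey(), incoming);
      } else if (existing instanceof Map && incoming instanceof Map) {
        // Both sides are nested objects: merge them recursively.
        merged.put(entry.getKey(),
            merge((Map<String, Object>) existing, (Map<String, Object>) incoming));
      } else {
        // Assumption: anything else with the same name is kept side by side in a set.
        Set<Object> union = new LinkedHashSet<>();
        addTypes(union, existing);
        addTypes(union, incoming);
        merged.put(entry.getKey(), union);
      }
    }
    return merged;
  }

  private static void addTypes(Set<Object> target, Object value) {
    if (value instanceof Set) {
      target.addAll((Set<?>) value);
    } else {
      target.add(value);
    }
  }

  // Two object items, each with a differently typed "id" field.
  static Map<String, Object> sample() {
    Map<String, Object> first = new LinkedHashMap<>();
    first.put("id", "integer");
    Map<String, Object> second = new LinkedHashMap<>();
    second.put("id", "string");
    second.put("message", "string");
    return merge(first, second);
  }

  public static void main(String[] args) {
    System.out.println(sample());
  }
}
```

The merged result holds a single `id` entry that carries both types, plus the `message` field that only the second object declares.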
Should this doc be in Understanding Airbyte? Or is it more appropriately scoped for the S3/blob storage connector docs?
This Avro doc was previously inlined in the S3 and other blob storage connector docs. As it keeps growing, I no longer want to duplicate it in multiple places. In the future, more connectors that use Avro / Parquet staging files will relate to this topic, so I think keeping everything here is the way to go. All relevant connector docs already link to this doc.
I realized that the wording is slightly off. I have updated the doc in a second Avro PR: #9367.
)
* Support array field with empty items specification
* Remove all exceptions
* Format code
* Bump connector versions
* Bump bigquery versions
* Update docs
* Remove unused code
* Update doc for PR #9363
* Update doc about defaulting all improperly typed fields to string
* Ignore bigquery
* Update version and doc
* Update doc
* Bump version in seed
What