-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[low code connectors] replace file retrieval with pkgutil to fix getting schema files #15814
Conversation
@@ -33,8 +34,38 @@ def __post_init__(self, options: Mapping[str, Any]): | |||
|
|||
def get_json_schema(self) -> Mapping[str, Any]: | |||
json_schema_path = self._get_json_filepath() | |||
with open(json_schema_path, "r") as f: | |||
return json.loads(f.read()) | |||
resource, schema_path = self.extract_resource_and_schema_path(json_schema_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this solution of splitting the URL. As mentioned in the below comment, this is a slight hack for us to figure out what resource the schemas are mounted under. Because setup.py mounts them in the source_sentry
resource for example, we need to figure out from the airbyte_cdk
module what the original source name was. All existing configs already prefix the source name so we can extract it from the first part of the path.
An alternative that I thought about was adding a new field to the JsonSchema dataclass called resource. However, I didn't like that because it will in 100% of cases always end up being the same as the name of the connector. It's also a weak abstraction and exposes our underlying file parsing mechanism by breaking it into two parts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you create an issue with the context form our Slack DM and link it as a TODO comment in the code?
yup will do! |
/publish-cdk dry-run=true
|
/publish-cdk dry-run=false
|
…ing schema files (#15814) * replace file retrieval with pkgutil to fix getting schema files * slightly better error handling on missing files * filter our schema gen warnings for some classes that cannot generate schemas * add comment for todo * add changelog and setup before publish
What
Similar to the bug that we can't read files from the running Docker container. We need to read them using pkgutils when they're accessible at runtime using
package_data
in setup.py . The same applies for schema files.This change should not require any changes to the existing yaml configs and I tested this locally for each of the integrations currently running in prod.
Example format of the json schema path:
How
Same concept as reading yaml configs. However, for slightly better DX, we should accept file structure that supports
.
prefix,/
prefix, and the start of resource all the same so we do some string manipulation. All the existing declarative configs in prod at the moment shouldn't be affected by this changeNote: I also added a small fix to filter out schema gen warnings for some builtin types that would get flagged by the dataclasses_json helper library. The annoying part about these is they get flagged by the platform as ERROR which is confusing to see when debugging and could lead to a lot of red herrings during OC. We should hide these. An example of the warnings: