-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: (CDK) (AsyncRetriever) - Improve UX on variable naming and interpolation #368
fix: (CDK) (AsyncRetriever) - Improve UX on variable naming and interpolation #368
Conversation
📝 WalkthroughWalkthroughThis pull request updates the asynchronous job handling by enhancing the declarative schema and renaming key extractor references. Three new fields— Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant JobRepository
participant JobServer
participant Extractor
Client->>JobRepository: Initiate async job request
JobRepository->>JobServer: Send creation request
JobServer-->>JobRepository: Return creation_response
JobRepository->>JobServer: Poll using creation_response
JobServer-->>JobRepository: Return polling_response
JobRepository->>Extractor: Extract download target from polling_response
Extractor-->>JobRepository: Return download_target
Possibly related PRs
Suggested labels
Suggested reviewers
Tip CodeRabbit's docstrings feature is now available as part of our Pro Plan! Simply use the command ✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/requesters/http_requester.py (1)
128-134
: Enhanced interpolation context with extra fields from stream_sliceThis change allows the path generator to utilize additional contextual information from the stream_slice's extra_fields, which enables more flexible path generation. This supports the improved variable naming and interpolation UX goals of the PR.
Would adding a debug log message when extra fields are included in the interpolation context be helpful for troubleshooting? Something like
LOGGER.debug(f"Including extra fields in interpolation context: {stream_slice.extra_fields}")
wdyt?airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
3673-3687
: Documentation added for new interpolation variablesThe documentation for the new interpolation variables is thorough and includes helpful examples. This will make it easier for users to understand how to use these variables in their configurations. One small thing - would the
download_target
examples be better without the URL prefix in the JSON examples? Something like"https://example.com/download/file.csv"
instead ofurl: "https..."
, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(4 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(1 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(2 hunks)airbyte_cdk/sources/declarative/requesters/README.md
(2 hunks)airbyte_cdk/sources/declarative/requesters/http_job_repository.py
(4 hunks)airbyte_cdk/sources/declarative/requesters/http_requester.py
(2 hunks)unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
(3 hunks)unit_tests/sources/declarative/requesters/test_http_job_repository.py
(3 hunks)unit_tests/sources/declarative/test_concurrent_declarative_source.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- airbyte_cdk/sources/declarative/models/declarative_component_schema.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (24)
airbyte_cdk/sources/declarative/requesters/README.md (3)
4-5
: Updated naming convention for better semantic clarityThe change from
urls_extractor
todownload_target_extractor
improves the clarity of the component's purpose, making it more intuitive for users to understand that this extractor is specifically responsible for retrieving download targets from the polling response.
28-28
: Simplified interpolation context for polling requesterThe use of
creation_response
in the interpolation context makes the path more intuitive and aligns with the breaking changes described in the PR objectives.
35-35
: Improved interpolation context for URL requesterNow using
polling_response
in the interpolation context, which is more clear and better reflects the data source.airbyte_cdk/sources/declarative/requesters/http_requester.py (1)
88-88
: Added type ignore comment for backoff_strategiesThe type ignore comment addresses type checking issues with the backoff_strategies attribute. This is a safe addition that prevents type checking errors without affecting runtime behavior.
unit_tests/sources/declarative/test_concurrent_declarative_source.py (2)
302-302
: Updated component name to download_target_extractorThis change aligns with the renaming in other parts of the codebase, ensuring consistency with the new naming convention.
310-310
: Simplified path reference using creation_responseThe path for polling_requester has been updated to use
creation_response['id']
directly, which is more straightforward than the previous complex reference to stream_slice. This improves readability and matches the PR objectives.airbyte_cdk/sources/declarative/requesters/http_job_repository.py (5)
46-46
: Renamed attribute to download_target_extractorThis change is consistent with the naming updates in other files and better reflects the component's purpose.
219-223
: Updated extra_fields to use download_targetThe field name has been changed from 'url' to 'download_target', which aligns with the new naming convention and improves consistency throughout the codebase.
275-280
: Enhanced stream slice with creation_responseNow extracting and including the creation_response in the extra_fields of the StreamSlice, which allows for more direct referencing in path interpolation.
287-292
: Updated stream slice with polling_responseSimilar to the creation_response change, this modification extracts the polling_response and includes it in the extra_fields, improving the UX for variable naming and interpolation.
300-300
: Updated method call to use download_target_extractorThis change completes the renaming of the urls_extractor to download_target_extractor, ensuring consistency across the codebase.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
2760-2762
: Rename from 'urls_extractor' to 'download_target_extractor' looks good!The variable has been consistently renamed to match the PR objectives, which improves naming clarity by better reflecting what this component actually extracts (a download target rather than just URLs).
2772-2773
: Consistent parameter rename in job_repository creationThe parameter name has been properly updated in the
AsyncHttpJobRepository
instantiation to use the renamed component, completing the renaming fromurls_extractor
todownload_target_extractor
.airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
1782-1784
: New interpolation variables added for AsyncRetriever UX improvementThese new variables (
creation_response
,polling_response
, anddownload_target
) provide more explicit and direct access to the data needed for path templating. This simplifies variable naming and improves clarity compared to the previous nested approach usingstream_slice
. Nice improvement to the developer experience!
3229-3229
: Component renamed fromurls_extractor
todownload_target_extractor
This rename makes the component's purpose more clear - it's extracting the download target rather than generic URLs. The name better aligns with its actual functionality, improving readability and maintainability. Good change!
unit_tests/sources/declarative/requesters/test_http_job_repository.py (3)
72-72
: Path updated to use newcreation_response
variableThe path has been updated to use the new
creation_response['id']
variable instead of the more verbosestream_slice['create_job_response'].json()['id']
. This aligns with the PR objectives and makes the code more readable.
87-87
: Path updated to use newdownload_target
variableThe path has been updated to use the new
download_target
variable directly rather than accessing it throughstream_slice.extra_fields['url']
. This simplifies the template and improves clarity.
146-146
: Parameter renamed fromurls_extractor
todownload_target_extractor
The parameter has been renamed to match the component name changes in the schema file. This ensures consistency across the codebase and better reflects the component's purpose.
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (6)
3610-3610
: The rename from urls_extractor to download_target_extractor improves clarity.This change makes the property name more descriptive and clearly conveys its purpose - extracting the download target rather than generic "urls". This aligns with the PR objectives to improve variable naming.
3618-3618
: Path interpolation simplified with creation_response.The path interpolation now uses
{{creation_response['id'] }}
instead of what was likely a more complex nested structure like{{stream_slice['create_job_response'].json()['id'] }}
. This is more readable and aligns with the PR objectives to improve UX on variable interpolation.
3638-3638
: Direct download_target reference improves readability.Using
{{download_target}}
instead of a more complex structure like{{stream_slice.extra_fields['url']}}
makes the template more intuitive and straightforward, wdyt?
3644-3644
: Consistent use of download_target in abort_requester path.The abort_requester path now uses the same simplified
download_target
variable pattern as the download_requester, creating consistency across the code.
3650-3650
: Simplified variable usage in delete_requester path.The delete_requester path follows the same pattern improvement as the download and abort requesters, using the clearer
download_target
variable.
3684-3684
: Updated assertion to match renamed property.The assertion property name has been updated to match the renamed property from urls_extractor to download_target_extractor, ensuring tests remain consistent with the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had one comment, otherwise the declarative_component_schema changes look good to me! I'll leave the python changes to @maxi297 to review
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing blocking because it seems functionally good, mostly renaming.
Should we rename url_requester
to download_target_requester
? Also, was there a plan to update source-amazon-ads, source-amazon-seller-partner, source-salesforce and source-sendgrid?
airbyte_cdk/sources/declarative/requesters/http_job_repository.py
Outdated
Show resolved
Hide resolved
I wasn't planning to do it myself, but this is definitely need to be done. |
I think we can do this within this PR as well, on it now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/requesters/http_job_repository.py (1)
283-294
: Enhanced_get_download_targets
method with more explicit response handling.The refactored method now clearly captures and passes the polling response in the extra_fields, making it available for interpolation. One minor suggestion: should we add a type annotation for the return type (e.g.,
-> Iterable[str]
) to further enhance code clarity? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(5 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(2 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(1 hunks)airbyte_cdk/sources/declarative/requesters/README.md
(3 hunks)airbyte_cdk/sources/declarative/requesters/http_job_repository.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- airbyte_cdk/sources/declarative/requesters/README.md
- airbyte_cdk/sources/declarative/models/declarative_component_schema.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (11)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
2747-2756
: Rename from url_requester to download_target_requester looks good!This change aligns well with the PR objective to improve variable naming. The new name more clearly communicates the purpose of this component for handling download targets rather than just URLs, which should make the code more intuitive to work with.
2760-2762
: Rename from urls_extractor to download_target_extractor is consistent.This change maintains naming consistency with the other renamed variables and clearly communicates that this component extracts download targets, not just URLs. Good improvement!
2767-2773
: Constructor parameter names updated consistently.You've correctly updated all references in the job_repository initialization to match the new variable names. This maintains consistency throughout the code.
2747-2773
: Reminder: This is a breaking change per PR description.As noted in the PR objectives, this is a breaking change for customers using AsyncRetriever. Users will need to update their interpolation method to utilize
creation_response
instead ofstream_slice
for polling_requester.path anddownload_target
instead ofstream_slice
for download_requester.path.Does this change need any additional documentation updates beyond what's in the PR description to assist users with migration? If needed, we could provide some configuration examples showing before/after for common use cases, wdyt?
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3)
1782-1784
: Nice addition of interpolation contexts for the AsyncRetriever component.These new interpolation contexts (
creation_response
,polling_response
, anddownload_target
) directly align with the breaking changes mentioned in the PR objectives, making the path variable configuration more intuitive. This enhances clarity by providing direct access to specific responses rather than digging through nestedstream_slice
structures.
3246-3247
: Renaming fromurls_extractor
todownload_target_extractor
improves clarity.This renaming better describes the component's actual purpose - extracting the final download target from the completed job response. The more precise terminology will make it easier for users to understand the component's function.
3673-3687
: Well-documented new interpolation contexts with helpful examples.The documentation clearly explains each new interpolation context with appropriate examples. The example for
download_target
is particularly good as it shows a realistic URL structure that users might encounter.airbyte_cdk/sources/declarative/requesters/http_job_repository.py (4)
46-47
: Renamed attributes align with schema changes for better consistency.The renaming from
urls_extractor
todownload_target_extractor
andurl_requester
todownload_target_requester
maintains consistency with the changes made in the schema definition. This approach makes the codebase more maintainable by using consistent naming across all files.Also applies to: 52-54
214-223
: Updated method call and variable name infetch_records
for consistency.The changes from
_get_download_urls
to_get_download_targets
and updating the extra_fields to use"download_target"
instead of"url"
reflect the renaming pattern throughout the codebase. This provides a better UX by using more descriptive variable names.
275-280
: Improved data structure in_get_create_job_stream_slice
with explicit response capture.Extracting and including the creation response in the extra_fields makes it directly accessible for interpolation in downstream components. This is a cleaner approach than relying on the nested structure within stream_slice.
296-300
: Updated error message and extraction logic to match new naming.The error message now correctly references
download_target_requester
instead ofurl_requester
, and the extraction usesdownload_target_extractor
for consistency. This code is now well-aligned with the renaming pattern throughout the codebase.
* main: fix: update cryptography package to latest version to address CVE (airbytehq#377) fix: (CDK) (HttpRequester) - Make the `HttpRequester.path` optional (airbytehq#370) feat: improved custom components handling (airbytehq#350) feat: add microseconds timestamp format (airbytehq#373) fix: Replace Unidecode with anyascii for permissive license (airbytehq#367) feat: add IncrementingCountCursor (airbytehq#346) feat: (low-code cdk) datetime format with milliseconds (airbytehq#369) fix: (CDK) (AsyncRetriever) - Improve UX on variable naming and interpolation (airbytehq#368) fix: (CDK) (AsyncRetriever) - Add the `request` and `response` to each `async` operations (airbytehq#356) fix: (CDK) (ConnectorBuilder) - Add `auxiliary requests` to slice; support `TestRead` for AsyncRetriever (part 1/2) (airbytehq#355) feat(concurrent perpartition cursor): Add parent state updates (airbytehq#343) fix: update csv parser for builder compatibility (airbytehq#364) feat(low-code cdk): add interpolation for limit field in Rate (airbytehq#353) feat(low-code cdk): add AbstractStreamFacade processing as concurrent streams in declarative source (airbytehq#347) fix: (CDK) (CsvParser) - Fix the `\\` escaping when passing the `delimiter` from Builder's UI (airbytehq#358) feat: expose `str_to_datetime` jinja macro (airbytehq#351) fix: update CDK migration for 6.34.0 (airbytehq#348) feat: Removes `stream_state` interpolation from CDK (airbytehq#320) fix(declarative): Pass `extra_fields` in `global_substream_cursor` (airbytehq#195) feat(concurrent perpartition cursor): Refactor ConcurrentPerPartitionCursor (airbytehq#331) feat(HttpMocker): adding support for PUT requests and bytes responses (airbytehq#342) chore: use certified source for manifest-only test (airbytehq#338) feat: check for request_option mapping conflicts in individual components (airbytehq#328) feat(file-based): sync file acl permissions and identities (airbytehq#260) fix: (CDK) (Connector Builder) - refactor the `MessageGrouper` > `TestRead` (airbytehq#332) fix(low code): Fix missing cursor for ClientSideIncrementalRecordFilterDecorator (airbytehq#334) feat(low-code): Add API Budget (airbytehq#314) chore(decoder): clean decoders and make csvdecoder available (airbytehq#326)
What
Resolving:
How
urls_extractor
todownload_target_extractor
.url_requester
todownload_target_requester
.interpolation context
forpolling_requester.path
anddownload_requester.path
to usecreation_response
anddownload_target
respectively, instead ofstream_slice
.User Impact
This update is considered a
Breaking Change
for the Customers who implemented theAsyncRetriever
currently; and might require customers to adjust their code as per the new naming conventions and interpolation contexts.Customers, who use the
AsyncRetriever
should change the interpolation they use as follows:polling_requester.path
:we switched to use the
creation_response
within the interpolation context, instead of relying on thestream_slice
context.Before:
After:
download_requester.path
:we switched to use the
download_target
within the interpolation context, instead of relying on thestream_slice
context.Before:
After:
urls_extractor
:the name of the component has changed
Before:
After:
url_requester
:the name of the component has changed
Before:
After:
Summary by CodeRabbit
New Features
Refactor
Documentation