fix: (CDK) (AsyncRetriever) - Use the Nested Decoders to decode the streaming responses, instead of ResponseToFileExtractor #378
Conversation
📝 Walkthrough

The changes remove a placeholder comment in the file extractor that referred to planned JSON support. Additionally, the

Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Wdyt about adding more reviewers to ensure a broader perspective on the changes?

✨ Finishing Touches
Actionable comments posted: 0
🧹 Nitpick comments (14)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

705-708: Consider adding field validation for `file_type`

The new `file_type` field looks good and provides the necessary flexibility to support both CSV and JSONL formats. I noticed it currently accepts any string though - would it make sense to restrict the values to just "csv" and "jsonl" using an Enum? This could prevent configuration errors and make the API more robust. Also, there's a small typo in the title: "storred" should be "stored".

```diff
-    file_type: Optional[str] = Field(
-        "csv",
-        title="The file type in which the response data is storred. Supported types are [csv, jsonl].",
-    )
+    class FileType(str, Enum):
+        CSV = "csv"
+        JSONL = "jsonl"
+
+    file_type: Optional[FileType] = Field(
+        FileType.CSV,
+        title="The file type in which the response data is stored. Supported types are [csv, jsonl].",
+    )
```

What do you think? This would make the allowed values explicit in the schema.
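To illustrate the Enum suggestion in isolation, here is a minimal stdlib-only sketch of how a `str`-backed Enum makes the allowed values explicit (the real model uses pydantic's `Field`; `validate_file_type` is a hypothetical helper, not CDK code):

```python
from enum import Enum


class FileType(str, Enum):
    CSV = "csv"
    JSONL = "jsonl"


def validate_file_type(value: str) -> FileType:
    # Enum lookup raises ValueError for anything outside {"csv", "jsonl"}
    try:
        return FileType(value)
    except ValueError:
        raise ValueError(
            f"Unsupported file_type {value!r}; expected one of {[m.value for m in FileType]}"
        )
```

With pydantic, declaring the field as `Optional[FileType]` would give the same rejection behavior for free during model construction.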
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

1681-1685: Suggestion: Fix Typo and Enhance Validation for `file_type`

I noticed a minor typo in the title ("storred" should be "stored") and would suggest adding an explicit enum constraint to ensure only the supported file types (`csv` and `jsonl`) are accepted. Would you be open to these adjustments, wdyt?

```diff
-      file_type:
-        title: The file type in which the response data is storred. Supported types are [csv, jsonl].
-        type: string
-        default: csv
+      file_type:
+        title: The file type in which the response data is stored. Supported types are [csv, jsonl].
+        type: string
+        enum: [csv, jsonl]
+        default: csv
```

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
1995-1997: Consider using the `FileTypes` enum for defaulting to CSV.

Currently, you default to `"csv"` via `model.file_type or "csv"`. Would you like to use `FileTypes.CSV.value` here for consistency with the new enum, or is the current approach already meeting your needs? wdyt?

airbyte_cdk/sources/declarative/extractors/response_to_file_extractor.py (11)
24-26: The `FileTypes` enum is well introduced.

Would you consider expanding it to handle other file formats in the future (e.g., Parquet)? wdyt?
32-43: Documentation is thorough.

It's great that you mention memory constraints, CSV streaming, and so on. Maybe add a tiny note to highlight new JSONL support in the docstring? wdyt?
47-47: Defaulting `file_type` to CSV is consistent.

Using the enum's default value (like `FileTypes.CSV.value`) might be safer. Would that help prevent typos if changed later? wdyt?
52-74: Docstring for `extract_records` is clear.

Would you consider adding a brief usage example to illustrate how to call `extract_records` with a JSONL file? wdyt?
118-135: Temporary file path generation.

Would you consider using Python's `tempfile` module (e.g., `tempfile.NamedTemporaryFile`) for more secure and robust temp file handling? wdyt?
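As a sketch of the `tempfile` suggestion (the helper name and signature are hypothetical, not the extractor's actual API):

```python
import os
import tempfile
from typing import Iterable


def write_response_to_temp_file(chunks: Iterable[bytes]) -> str:
    # delete=False so the file survives the context manager; the caller
    # is responsible for removing it once the records have been read.
    with tempfile.NamedTemporaryFile(mode="wb", suffix=".jsonl", delete=False) as tmp:
        for chunk in chunks:
            tmp.write(chunk)
        return tmp.name
```

`NamedTemporaryFile` picks a unique, permission-restricted path for you, which avoids both collisions and predictable-filename issues that hand-rolled path generation can introduce.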
175-205: Chunk reading from CSV.

Your approach with `pd.read_csv` in chunks is great for memory usage. Would you consider allowing users to configure `chunk_size` more dynamically? wdyt?
206-233: JSON lines chunk reading.

This mirrors the CSV strategy nicely. Maybe unify the chunk size handling or document it consistently for both CSV and JSONL? wdyt?
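For reference, the chunked-JSONL idea can be sketched with the stdlib alone (the extractor itself reads via pandas; the helper name and default `chunk_size` here are illustrative):

```python
import json
from itertools import islice
from typing import Any, Dict, Iterator, List


def read_jsonl_in_chunks(path: str, chunk_size: int = 2) -> Iterator[List[Dict[str, Any]]]:
    # Iterate the file lazily so only `chunk_size` records are in memory at once,
    # mirroring the chunked pd.read_csv strategy used for CSV files.
    with open(path, "r", encoding="utf-8") as fp:
        while True:
            chunk = [json.loads(line) for line in islice(fp, chunk_size)]
            if not chunk:
                return
            yield chunk
```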
235-253: Conditional approach to file types.

You might consider a more direct dispatch using the `FileTypes` enum instead of if-else checks. That could simplify or scale better if file types expand. wdyt?
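The if-else to dispatch-table idea, sketched with hypothetical reader callables standing in for the extractor's CSV/JSONL chunk readers:

```python
from enum import Enum
from typing import Callable, Dict


class FileTypes(str, Enum):
    CSV = "csv"
    JSONL = "jsonl"


# Placeholder readers; the real ones would yield record chunks from the file.
def _read_csv(path: str) -> str:
    return f"csv:{path}"


def _read_jsonl(path: str) -> str:
    return f"jsonl:{path}"


_READERS: Dict[FileTypes, Callable[[str], str]] = {
    FileTypes.CSV: _read_csv,
    FileTypes.JSONL: _read_jsonl,
}


def read_file(path: str, file_type: FileTypes) -> str:
    # Adding a new format becomes a one-line dict entry instead of a new branch.
    try:
        return _READERS[file_type](path)
    except KeyError:
        raise ValueError(f"Unsupported file type: {file_type}")
```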
256-258: Conditional for CSV is straightforward.

Are you comfortable returning an error if a non-enum file type is passed here, or did you want additional fallback logic? wdyt?
263-264: Switching to `logger.info`.

Previously, some might log empty data as a warning. Are you sure this info-level is enough to alert you in debugging? wdyt?
267-268: Typographical fix for 'occured'.

Would you like to correct "occured" to "occurred"? wdyt?

```diff
- message = "ResponseToFileExtractor._read_with_chunks(): The IO/Error occured while reading the data from file."
+ message = "ResponseToFileExtractor._read_with_chunks(): The IO/Error occurred while reading the data from file."
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
- airbyte_cdk/sources/declarative/extractors/response_to_file_extractor.py (6 hunks)
- airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1 hunks)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Pytest (Fast)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (4)
airbyte_cdk/sources/declarative/extractors/response_to_file_extractor.py (4)
10-10: Import of Enum looks good!

No issues spotted. The new enum usage is appropriate and improves code clarity.
153-154: Use of `_get_file_path()` is correct.

No issues spotted here; good job centralizing filename generation logic.
155-156: Snippet for writing to file is fine.

The context management and response closing follow best practices.
259-261: JSONL branch looks good.

No further concerns – the logic is consistent for reading JSONL.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (outdated; resolved)
I just want to make sure we don't introduce something in the declarative language that we would need to revert, thereby creating a breaking change.
…port-to-response-to-file-extractor
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (1)

53-59: Documentation improvement looks good. Thoughts on adding more context?

The updated documentation clearly explains the behavior when non-gzipped data is encountered. This matches the implementation and helps future developers understand the fallback mechanism.
What do you think about adding a brief note on why this fallback behavior is important for file extraction, particularly with respect to the new JSONL support mentioned in the PR? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (1 hunks)
- airbyte_cdk/sources/declarative/extractors/response_to_file_extractor.py (0 hunks)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
💤 Files with no reviewable changes (1)
- airbyte_cdk/sources/declarative/extractors/response_to_file_extractor.py
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (2)
38-46: Clear and useful helper method to improve error handling. Nice addition!

The new `_reset_reader_pointer` method is well-documented and addresses a specific need: resetting the reader position after failed gzip decompression. This is essential for the improved error handling in the `parse` method.
61-67: Excellent error handling implementation!

This is a robust approach to handling potentially non-gzipped data:
- Attempts gzip decompression first
- Falls back gracefully on error
- Resets the pointer before sending to inner parser
- Logs an appropriate warning
This will make the parser more resilient when processing various input formats and prevent unnecessary failures when data isn't actually gzipped.
Could we consider adding a unit test that specifically verifies this fallback mechanism works with both gzipped and non-gzipped data? wdyt?
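That fallback can be exercised in isolation with the stdlib; a sketch (the function name is hypothetical, not the CDK's) that also doubles as the kind of unit test suggested above:

```python
import gzip
import io


def decompress_or_passthrough(data: bytes) -> bytes:
    """Try gzip decompression first; on failure, reset the pointer and pass through."""
    buffer = io.BytesIO(data)
    try:
        return gzip.GzipFile(fileobj=buffer).read()
    except (OSError, EOFError):
        # Reading consumed some bytes before failing, so rewind before
        # handing the raw stream to the inner parser.
        buffer.seek(0)
        return buffer.read()
```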
changed the title: "… `JsonL` output file-type support" → "… `Nested Decoders` to decode the streaming responses, instead of ResponseToFileExtractor"
A couple of concerns that I'm curious to have your input on
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (outdated; resolved)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (outdated; resolved)
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (outdated; resolved)
…port-to-response-to-file-extractor
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (1)
70-76: Remove debug print statement

There's a debugging print statement that should be removed before production.

```diff
- print(f"\n\nHERE\n\n")
```

This statement could clutter logs in production, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (7 hunks)
- airbyte_cdk/sources/declarative/decoders/zipfile_decoder.py (3 hunks)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
- unit_tests/sources/declarative/decoders/test_composite_decoder.py (3 hunks)
- unit_tests/sources/declarative/decoders/test_zipfile_decoder.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
🔇 Additional comments (10)
unit_tests/sources/declarative/decoders/test_zipfile_decoder.py (1)
46-51: Test now properly simulates compression headers

The addition of the `Content-Encoding` header is a good improvement to test the behavior of `ZipfileDecoder` with compressed responses. This aligns well with the new `is_compressed` method in the implementation.

airbyte_cdk/sources/declarative/decoders/zipfile_decoder.py (3)
16-16: Good import consolidation

Importing both `COMPRESSION_TYPES` and `Parser` from the same module is a clean approach, making the dependencies more explicit.
29-34: Well-documented `is_compressed` method

The new method clearly checks if the response is compressed based on the Content-Encoding header. The documentation explains the purpose well.
44-47: Pass compression status to parser

The update to pass the compression status to the parser's `parse` method ensures consistent handling of compressed data. This is a good implementation that aligns with changes in `CompositeRawDecoder`.

unit_tests/sources/declarative/decoders/test_composite_decoder.py (3)
71-72: Added Content-Encoding header for better test coverage

Good addition of the `Content-Encoding` header to test gzip parsing with the proper headers, which makes the test more realistic.
109-124: Enhanced test coverage for gzip parsing with JSON lines

The updated test properly verifies that the `GzipParser` with `JsonLineParser` can handle gzipped content with appropriate headers.
126-144: Added test for non-compressed data handling

Excellent addition of a test case that verifies the `GzipParser` can handle non-compressed data. This ensures the decoder is robust when encountering various response types.

airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (3)
25-34: Good list of compression types

Comprehensive list of compression types to check against. This improves the decoder's ability to handle various compression formats.
190-195: Well-implemented compression detection

The `is_compressed` method is clear and concise, checking the `Content-Encoding` header against the predefined compression types.
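A minimal sketch of that header-based detection (the compression-type list here is an illustrative subset, not the CDK's exact `COMPRESSION_TYPES`):

```python
from typing import Mapping

# Illustrative subset; the real COMPRESSION_TYPES list covers more variants.
COMPRESSION_TYPES = ["gzip", "x-gzip", "deflate", "br", "zstd"]


def is_compressed(headers: Mapping[str, str]) -> bool:
    # The Content-Encoding header decides whether the payload needs
    # decompression before it reaches the inner parser.
    return headers.get("Content-Encoding", "") in COMPRESSION_TYPES
```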
206-206: Pass compression status to parser

Good implementation of passing the compression status to the parser's `parse` method. This ensures consistent handling of compressed data.
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (outdated; resolved)
Actionable comments posted: 0
🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/decoders/decoder_parser.py (3)
1-13: Imports and logger setup look good!

The imports are comprehensive and the logger setup is appropriately named. One small suggestion - would adding a module-level docstring help explain the purpose of this new file in the decoder ecosystem? Something brief that explains how it relates to the nested decoders approach mentioned in the PR objectives? wdyt?
18-26: Parser abstract base class is well-structured.

The abstract base class with the `parse` method provides a good foundation for different parser implementations. The docstring is concise but informative.

Could we enhance the docstring to provide more details about expected input data formats and the structure of the yielded dictionaries? This would help implementers understand their responsibilities better. wdyt?
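The described base class could look roughly like this - a sketch reconstructed from the review's description, not the file's verbatim contents:

```python
import io
import json
from abc import ABC, abstractmethod
from typing import Any, Dict, Generator

# Output shape shared by all parser implementations.
PARSER_OUTPUT_TYPE = Generator[Dict[str, Any], None, None]


class Parser(ABC):
    @abstractmethod
    def parse(self, data: io.BufferedIOBase) -> PARSER_OUTPUT_TYPE:
        """Parse the raw byte stream and yield one dict per record."""


class JsonLineParser(Parser):
    def parse(self, data: io.BufferedIOBase) -> PARSER_OUTPUT_TYPE:
        # One JSON document per line; skip blank lines.
        for line in data:
            if line.strip():
                yield json.loads(line)
```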
28-30: Consider adding more documentation for complex type definitions.

These type aliases define important structures but their purpose and composition might not be immediately clear to other developers.

Would it help to add comments explaining:

- What the three elements in the `PARSERS_TYPE` tuple represent (looks like content types, file extensions, and parser instances?)
- How `PARSERS_BY_HEADER_TYPE` is used to select parsers based on headers?

This would make the code more maintainable as the parser ecosystem grows. wdyt?
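Following the reviewer's reading of the tuple, the commented aliases might look like this (an interpretation, not the actual source):

```python
from typing import Any, Dict, List, Tuple

# Each entry: (matching content types, matching file extensions, parser instance).
PARSERS_TYPE = List[Tuple[List[str], List[str], Any]]

# Maps a header name to {header value -> parser}, so a parser can be
# selected from e.g. the Content-Type of a response.
PARSERS_BY_HEADER_TYPE = Dict[str, Dict[str, Any]]
```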
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py (5 hunks)
- airbyte_cdk/sources/declarative/decoders/decoder.py (2 hunks)
- airbyte_cdk/sources/declarative/decoders/decoder_parser.py (1 hunks)
- airbyte_cdk/sources/declarative/decoders/zipfile_decoder.py (2 hunks)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3 hunks)
- unit_tests/sources/declarative/decoders/test_composite_decoder.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- airbyte_cdk/sources/declarative/decoders/decoder.py
- airbyte_cdk/sources/declarative/decoders/zipfile_decoder.py
- unit_tests/sources/declarative/decoders/test_composite_decoder.py
- airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (5)
airbyte_cdk/sources/declarative/decoders/decoder_parser.py (1)
15-15: Type alias for parser output looks good.

This type alias clearly defines the expected output format for parsers - a generator yielding dictionaries. It's well-defined and will help with type checking.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
2196-2202: Changed from static to instance method, aligning with `create_csv_decoder` pattern.

The `create_jsonl_decoder` method has been modified from a static method to an instance method by adding the `self` parameter. This change enables access to instance variables such as `self._emit_connector_builder_messages`, allowing the method to configure the `stream_response` parameter consistently with the `create_csv_decoder` method.
2204-2230: Improved GzipDecoder to handle both compressed and uncompressed data.

The enhanced implementation now intelligently processes responses based on content encoding headers, with a fallback mechanism for uncompressed data. This aligns with the PR objectives of using nested decoders for better stream handling.
I especially like the detailed comments explaining the behavior with different response streaming approaches.
2897-2902: Method parameter formatting updated for consistency.

Parameters passed to `download_target_extractor` are now formatted with line breaks between each parameter, which improves readability and maintains consistency with other method calls in the file.
2778-2782: Method parameter formatting improved for readability.

Parameters for `model.download_paginator` are now consistently formatted with one parameter per line, which improves code readability and maintainability.
If CI passes, I'm all in on this change. Thanks for diving in the requests lib shenanigans with me ❤️
What

Resolving:

How

- Updated the `decoder` notation in `model_to_component_factory.py` to distinguish between the decoder used where the `ResponseToFileExtractor` support is terminated. This is the deprecated component that should be replaced by the "nested decoders" in the future. Please refer to the example below for more information.
- The `download_extractor` should be supplied as a part of the manifest only when we have to deal with the field pointer to the data (basically when the data is nested within the `response`), the example is here

Side-Fix:

- The `_join_url` method.
method.User Impact
No impact is expected. This is not a breaking change.
Before we supported the CSV-only:
After:
For JSON-Lines:
For CSV
For Gzipped JSON-Lines:
For Gzipped CSV:
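The manifest examples above were elided from this page; as an illustration of the nested-decoder shape (field and type names are assumptions and may differ from the current declarative_component_schema.yaml), a gzipped JSON-Lines download decoder might be declared as:

```yaml
# Hypothetical AsyncRetriever fragment - a GzipDecoder wrapping an inner
# JsonlDecoder, instead of routing through ResponseToFileExtractor.
download_decoder:
  type: GzipDecoder
  decoder:
    type: JsonlDecoder
```

Swapping the inner `JsonlDecoder` for a `CsvDecoder` would cover the gzipped-CSV case, and dropping the outer `GzipDecoder` covers the uncompressed variants.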
Summary by CodeRabbit
New Features
Refactor
Bug Fixes