Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: (CDK) (manifest-only) - add ability to debug manifest-only sources / destinations #407

Merged
merged 7 commits into from
Mar 12, 2025

Conversation

bazarnov
Copy link
Contributor

@bazarnov bazarnov commented Mar 11, 2025

What

Frequently, we had cases when we need/have to debug a manifest-only source/destination, but we don't have a fast and relatively easy way of doing that to reproduce the issue or introduce new features to the declarative_component_schema, etc.

How

  • Previously, debugging manifest-only sources and destinations required creating a copy of the source-declarative-manifest, modifying the pyproject.toml, installing an environment, and ensuring compatibility with versioning of the manifest and other related elements.
  • With this change, we have gained a highly flexible approach to debugging manifest-only sources and destinations by providing only the necessary components, such as config, catalog, state and the manifest itself.
  • Furthermore, the ability to configure the debugger based on your IDE further enhances this capability, allowing you to utilize breakpoints and gain a comprehensive understanding of the runtime state to identify and resolve any issues effectively.
  • see the README.md for more information.

Summary by CodeRabbit

  • New Features

    • Introduced a debug utility that enables manifest debugging with customizable command-line parameters.
  • Documentation

    • Added detailed guidance for setting up and running debugging sessions within your IDE, covering necessary configuration and resource management.
  • Chores

    • Improved repository file management to exclude extraneous files while preserving essential directory structures.

@github-actions github-actions bot added bug Something isn't working security labels Mar 11, 2025
@bazarnov bazarnov self-assigned this Mar 11, 2025
@bazarnov bazarnov marked this pull request as ready for review March 12, 2025 10:03
@Copilot Copilot bot review requested due to automatic review settings March 12, 2025 10:03
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds the ability to debug "manifest-only" sources/destinations in the CDK by introducing a dedicated debug script and providing detailed VSCode configuration instructions in the README.

  • Added a new README with step-by-step instructions to configure and use the debugger.
  • Introduced a debug_manifest.py file to launch the debugging session using the manifest.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
debug_manifest/README.md Documentation for configuring the debugger with VSCode and resource placement for manifest-only connectors.
debug_manifest/debug_manifest.py A new script to initiate debugging of manifest-only sources.

Copy link
Contributor

coderabbitai bot commented Mar 12, 2025

📝 Walkthrough

Walkthrough

The pull request introduces three new files within the debug_manifest directory. A .gitignore file is added to ignore all files in the resources subfolder except for .gitkeep, ensuring that an empty directory is maintained. A README.md file provides step-by-step instructions for configuring a Visual Studio Code debugger for the module. Finally, a new Python file (debug_manifest.py) implements a debug_manifest function that sets up and launches a debug process using a YAML configuration from the resources directory.

Changes

File Path Change Summary
debug_manifest/.gitignore New file that ignores all files in the resources directory except .gitkeep to preserve an empty folder.
debug_manifest/README.md New documentation file that details how to configure VSCode for debugging the debug_manifest module, including instructions for creating a tailored launch.json.
debug_manifest/debug_manifest.py New Python script adding the debug_manifest function, which creates a YamlDeclarativeSource from a YAML configuration and invokes a launch process with provided command-line arguments.

Sequence Diagram(s)

sequenceDiagram
  participant U as User
  participant DM as debug_manifest.py
  participant YDS as YamlDeclarativeSource
  participant L as launch Function

  U->>DM: Execute script with command-line arguments
  DM->>YDS: Create source instance from YAML configuration
  DM->>L: Invoke launch(source, args)
  L->>U: Execute debugging process and respond
Loading

Suggested labels

enhancement, docs, airbyte-python-cdk

Suggested reviewers

  • bnchrch
  • aaronsteers
  • maxi297

Would you like to add any further clarifications or additional diagrams, wdyt?


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
debug_manifest/debug_manifest.py (1)

13-15: Configuration looks good, but have you considered making the path configurable?

The hardcoded path works fine for the typical use case, but would it be helpful to allow users to override this path via an environment variable or command-line argument for more flexibility? wdyt?

debug_manifest/README.md (4)

51-51: Minor grammar improvement opportunity.

Would this read better with a comma after "Alternatively" and correcting "swith" to "switch"?

- 8. Replace the `"python": "<PATH_TO_CDK_ENV>/bin/python"` with the correct interpreter `PATH` pointing to the `CDK env` installed from Step `7` (use `which python` to have the complete python path), to wire the CDK env to the debugger. Alternatively you can swith the default interpreter you use in your IDE.
+ 8. Replace the `"python": "<PATH_TO_CDK_ENV>/bin/python"` with the correct interpreter `PATH` pointing to the `CDK env` installed from Step `7` (use `which python` to have the complete python path), to wire the CDK env to the debugger. Alternatively, you can switch the default interpreter you use in your IDE.
🧰 Tools
🪛 LanguageTool

[typographical] ~51-~51: Consider adding a comma after ‘Alternatively’ for more clarity.
Context: ...), to wire the CDK env to the debugger. Alternatively you can swith the default interpreter y...

(RB_LY_COMMA)


54-54: Small typo in documentation.

- * These resources are ignorred by `git`, in the `.gitignore`, thus should not be commited
+ * These resources are ignored by `git`, in the `.gitignore`, thus should not be committed

56-59: Resource setup instructions are clear, but structure could be improved.

The instructions are clear, but three consecutive sentences begin with "Put the...". Would you consider restructuring them for better readability? Perhaps using a table or a different format? wdyt?

🧰 Tools
🪛 LanguageTool

[style] ~58-~58: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...red catalogfor the target source). 3. Put themanifest.yamlinside the/airbyt...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


61-61: Small typo in heading.

- ## Debuging Steps
+ ## Debugging Steps
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 969dec1 and 3834109.

📒 Files selected for processing (3)
  • debug_manifest/.gitignore (1 hunks)
  • debug_manifest/README.md (1 hunks)
  • debug_manifest/debug_manifest.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
debug_manifest/README.md

[typographical] ~51-~51: Consider adding a comma after ‘Alternatively’ for more clarity.
Context: ...), to wire the CDK env to the debugger. Alternatively you can swith the default interpreter y...

(RB_LY_COMMA)


[style] ~58-~58: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...red catalogfor the target source). 3. Put themanifest.yamlinside the/airbyt...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (5)
debug_manifest/.gitignore (1)

1-5: LGTM! Well-structured .gitignore for managing resources directory.

The .gitignore file is correctly configured to ignore all files in the resources directory while preserving the directory structure through .gitkeep. This approach ensures that developers can have their local configuration files without accidentally committing them, which is ideal for debugging purposes.

debug_manifest/debug_manifest.py (2)

17-23: Function implementation is clean and well-documented.

The function is well-typed with appropriate annotations and has a clear docstring. The implementation is simple and directly leverages the existing CDK launch functionality.


25-29: Entry point looks good.

The main block correctly initializes the source with the configuration and passes command-line arguments to the debug_manifest function.

debug_manifest/README.md (2)

14-47: VSCode configuration example is clear and well-documented.

The JSON configuration example includes helpful comments for each argument and covers all the essential parameters needed for debugging manifest-only sources and destinations.


67-67: Great summary that captures the value of this feature.

This concluding line effectively summarizes the benefits of this new debugging capability, highlighting how it brings parity between manifest-only sources and regular Python sources.

@bazarnov
Copy link
Contributor Author

Automation ideas for the future:

  • autogenerated catalog
  • extract testing values > config.json (or similar) during the export manifest in Builder

Ideally, it would be very handy if the export functionality had a feature called Export for Debugging (for admins only) that comes as a zip package with manifest + config, which we can directly use while debugging.

bazarnov and others added 5 commits March 12, 2025 12:09
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
debug_manifest/README.md (5)

1-7: Clarify the introduction section?
The introduction and headings in lines 1–7 provide a clear context for the debugging setup. Would you consider adding a brief "Overview" or "Prerequisites" sub-section for developers who might be new to the CDK? wdyt?


14-47: Review the JSON snippet details?
The provided launch.json configuration is comprehensive and illustrative. Note that while standard JSON does not support comments, VSCode’s configuration uses JSONC which permits them. Would you consider adding a brief note (perhaps in a comment preceding the block) clarifying that the inline comments are for documentation purposes only? wdyt?


49-51: Improve readability in the interpreter instructions.
The instructions after the JSON snippet are clear. However, on line 51, consider adding a comma after “Alternatively” for better clarity. For example:

-...to wire the CDK env to the debugger. Alternatively you can switch the default interpreter you use in your IDE.
+...to wire the CDK env to the debugger. Alternatively, you can switch the default interpreter you use in your IDE.

Would you be open to this change? wdyt?

🧰 Tools
🪛 LanguageTool

[typographical] ~51-~51: Consider adding a comma after ‘Alternatively’ for more clarity.
Context: ...), to wire the CDK env to the debugger. Alternatively you can switch the default interpreter ...

(RB_LY_COMMA)


52-57: Enhance details for resource setup?
The steps for setting up the necessary resources (lines 52–57) are succinct and clear. Would you consider adding a brief description of what each of the resource files (e.g., config.json, catalog.json, manifest.yaml, and optionally state.json) should contain? This might help developers better prepare the required files. wdyt?


59-63: Clarify the "Iterate" instruction.
The debugging steps are generally clear. However, the instruction on line 62—"Iterate over the 2 and 3"—might be a bit ambiguous. Would you consider rephrasing it to "Repeat steps 2 and 3" to improve clarity? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3834109 and 933c9dc.

📒 Files selected for processing (1)
  • debug_manifest/README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
debug_manifest/README.md

[typographical] ~51-~51: Consider adding a comma after ‘Alternatively’ for more clarity.
Context: ...), to wire the CDK env to the debugger. Alternatively you can switch the default interpreter ...

(RB_LY_COMMA)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
debug_manifest/README.md (2)

8-12: Great step-by-step instructions!
The steps outlining the process to create and configure the launch.json file are well organized and easy to follow. Nice job!


64-64: Effective final summary!
The concluding note effectively communicates that users can now run manifest-only sources with full command support and debugging options. This is clear and concise.

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@bazarnov bazarnov merged commit 741a2a0 into main Mar 12, 2025
21 checks passed
@bazarnov bazarnov deleted the baz/cdk/add-manifest-only-debug-capabilities branch March 12, 2025 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working security
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants