Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shopify acceptance tests fail and block the PRs #197

Closed
rpopov opened this issue Jan 4, 2025 · 3 comments
Closed

Shopify acceptance tests fail and block the PRs #197

rpopov opened this issue Jan 4, 2025 · 3 comments

Comments

@rpopov
Copy link

rpopov commented Jan 4, 2025

Running the CAT tests locally on:

  • Oracle Linux 9.5
  • Oracle Linux 8.7
  • Fedora Workstation 41

Observations

  • Running the CDK acceptance tests locally with:
cd airbyte/airbyte-integrations/bases/connector-acceptance-test
poetry run pytest -p connector_acceptance_test.plugin --acceptance-test-config=../../connectors/source-shopify --timeout=300

fails with the container (Docker image: registry.dagger.io/engine v0.15.1) errors:

# expect more open files due to per-client SQLite databases
# many systems default to 1024 which is far too low
ulimit -n 1048576 || echo "cannot increase open FDs with ulimit, ignoring"
... ... ...
buildkitd: failed to create engine: failed to create network providers: CNI setup error: plugin type="bridge" failed (add): failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 3: iptables v1.8.10 (legacy): can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
iptables v1.8.10 (legacy): can't initialize iptables table `filter': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
...
=========================== short test summary info ============================
SKIPPED [1] ../app/connector_acceptance_test/plugin.py:65: Skipping TestConnectorAttributes.test_streams_define_primary_key: not found in the config.
SKIPPED [1] ../app/connector_acceptance_test/plugin.py:65: Skipping TestConnectorDocumentation.test_prerequisites_content: not found in the config.
SKIPPED [1] ../app/connector_acceptance_test/tests/test_core.py:142: Config is not provided
SKIPPED [1] ../app/connector_acceptance_test/tests/test_core.py:121: The previous and actual specifications are identical.
SKIPPED [1] ../app/connector_acceptance_test/tests/test_core.py:803: This tests currently leads to too much failures. We need to fix the connectors at scale first.
SKIPPED [2] ../app/connector_acceptance_test/tests/test_core.py:1269: Skipping the test for supported file types as it is only applicable for certified file-based connectors.
ERROR test_core.py::TestDiscovery::test_streams_have_valid_json_schemas[inputs0]
ERROR test_core.py::TestDiscovery::test_defined_cursors_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_defined_refs_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-allOf]
ERROR test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-not]
ERROR test_core.py::TestDiscovery::test_primary_keys_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_streams_has_sync_modes[inputs0] - Val...
ERROR test_core.py::TestDiscovery::test_additional_properties_is_true[inputs0]
ERROR test_core.py::TestDiscovery::test_backward_compatibility[inputs0] - Val...
ERROR test_core.py::TestDiscovery::test_primary_keys_data_type[inputs0] - Val...
ERROR test_core.py::TestBasicRead::test_read[inputs0] - ValueError: No catalo...
ERROR test_core.py::TestBasicRead::test_read[inputs1] - ValueError: No catalo...
ERROR test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
ERROR test_incremental.py::TestIncremental::test_two_sequential_reads[inputs0]
ERROR test_incremental.py::TestIncremental::test_read_sequential_slices[inputs0]
ERROR test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
FAILED test_core.py::TestConnection::test_check[inputs0] - AssertionError: Co...
FAILED test_core.py::TestConnection::test_check[inputs2] - AssertionError: Co...
FAILED test_core.py::TestConnection::test_check[inputs4] - AssertionError: Co...
FAILED test_core.py::TestDiscovery::test_discover[inputs0] - IndexError: list...
== 4 failed, 28 passed, 7 skipped, 8 warnings, 16 errors in 114.62s (0:01:54) ==

Suggestion

  • Please investigate the issue of running the Dagger container locally. It maybe causes the mass errors reported.
  • Please investigate if there is dependency between the host/OS the container is run on. That could explain why those tests may succeed (if run on hosts with different OS).
  • Please update the documentation, specifically:
@rpopov
Copy link
Author

rpopov commented Jan 4, 2025

The problem is caused by the fact that I used:

# just for the current session, until restart
sudo modprobe iptable_nat

# include the nat module to survive restart
echo iptable_nat | sudo tee -a /etc/modules-load.d/modules

This solves the problem.

@rpopov
Copy link
Author

rpopov commented Jan 5, 2025

The tests still fail. These are:

# This test should not be part of TestSpec because it's testing the connector's docker image content, not the spec itself
# But it's cumbersome to declare a separate, non configurable, test class
# See https://github.com/airbytehq/airbyte/issues/15551
  • It checks if the last image of the specific connector is referred in the test configuration with the actual version. In the example of shopify, it compares image versionconnector_image: airbyte/source-shopify:dev in airbyte/airbyte-integrations/connectors/source-shopify/acceptance-test-config.yml (dev) matches the version set in dockerImageTag field in airbyte/airbyte-integrations/connectors/source-shopify/metadata.yaml (2.5.15).
  • This test logic depends only in the specific connector's code in the root project and does not depend on the changes in the CDK porject.
  • As shown in the shopify example, the "dev" version does not match "2.5.15", so this test fails.
  • test_check in airbyte/airbyte-integrations/bases/connector-acceptance-test/connector_acceptance_test/tests/test_core.py
    • This test verifies that the number of connection messages in success/failure status are exactly 1, extracting them from the logs.
    • This test highly depends on the infrastructure's performance and container caching (when run locally).
  • test_airbyte_trace_message_on_failure to "Ensure that a trace message is emitted when the connector crashes". It fails when run locally.

All these tests seem to be flaky, dependent on the infrastructure and the development practices. They seem to be not related to the CDK changes, but blocking them.

@rpopov rpopov changed the title Shopify acceptance tests fail due to Dagger/Buildkit container failure Shopify acceptance tests fail and block the PRs Jan 8, 2025
rpopov added a commit to rpopov/airbyte-python-cdk that referenced this issue Jan 13, 2025
At record extraction step, in each record add the service field $root holding a reference to:
* the root response object, when parsing JSON format
* the original record, when parsing JSONL format
that each record to process is extracted from.
More service fields could be added in future.
The service fields are available in the record's filtering and transform steps.

Avoid:
* reusing the maps/dictionaries produced, thus avoid building cyclic structures
* transforming the service fields in the Flatten transformation.

Explicitly cleanup the service field(s) after the transform step, thus making them:
* local for the filter and transform steps
* not visible to the next mapping and store steps (as they should be)
* not visible in the tests beyond the test_record_selector (as they should be)
This allows the record transformation logic to define its "local variables" to reuse
some interim calculations.

The contract of body parsing seems irregular in representing the cases of bad JSON, no JSON and empty JSON.
Cannot be unified as that that irregularity is already used.

Update the development environment setup documentation
* to organize and present the setup steps explicitly
* to avoid misunderstandings and wasted efforts.

Update CONTRIBUTING.md to
* collect and organize the knowledge on running the test locally.
* state the actual testing steps.
* clarify and make explicit the procedures and steps.

The unit, integration, and acceptance tests in this exactly version succeed under Fedora 41, while
one of them fails under Oracle Linux 8.7. not related to the contents of this PR.
The integration tests of the CDK fail due to missing `secrets/config.json` file for the Shopify source.
See airbytehq#197
@rpopov
Copy link
Author

rpopov commented Jan 20, 2025

This is still a problem. It does not look like resolved. See: #214

rpopov added a commit to rpopov/airbyte-python-cdk that referenced this issue Feb 8, 2025
At record extraction step, in each record add the service field $root holding a reference to:
* the root response object, when parsing JSON format
* the original record, when parsing JSONL format
that each record to process is extracted from.
More service fields could be added in future.
The service fields are available in the record's filtering and transform steps.

Avoid:
* reusing the maps/dictionaries produced, thus avoid building cyclic structures
* transforming the service fields in the Flatten transformation.

Explicitly cleanup the service field(s) after the transform step, thus making them:
* local for the filter and transform steps
* not visible to the next mapping and store steps (as they should be)
* not visible in the tests beyond the test_record_selector (as they should be)
This allows the record transformation logic to define its "local variables" to reuse
some interim calculations.

The contract of body parsing seems irregular in representing the cases of bad JSON, no JSON and empty JSON.
Cannot be unified as that that irregularity is already used.

Update the development environment setup documentation
* to organize and present the setup steps explicitly
* to avoid misunderstandings and wasted efforts.

Update CONTRIBUTING.md to
* collect and organize the knowledge on running the test locally.
* state the actual testing steps.
* clarify and make explicit the procedures and steps.

The unit, integration, and acceptance tests in this exactly version succeed under Fedora 41, while
one of them fails under Oracle Linux 8.7. not related to the contents of this PR.
The integration tests of the CDK fail due to missing `secrets/config.json` file for the Shopify source.
See airbytehq#197

Polish

Integrate the DpathEnhancingExtractor in the UI of Airbyte.
Created a DPath Enhancing Extractor
Refactored the record enhancement logic - moved to the extracted class
Split the tests of DPathExtractor and DPathEnhancingExtractor

Fix the failing tests:

FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_custom_components[test_create_custom_component_with_subcomponent_that_uses_parameters]
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_custom_components_do_not_contain_extra_fields
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_parse_custom_component_fields_if_subcomponent
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_page_increment
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_offset_increment
FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[simple_unstructured_scenario]
FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[no_file_extension_unstructured_scenario]

They faile because of comparing string and int values of the page_size (public) attribute.
Imposed an invariant:
  on construction, page_size can be set to a string or int
  keep only values of one type in page_size for uniform comparison (convert the values of the other type)
  _page_size holds the internal / working value
... unless manipulated directly.

Merged:
feat(low-code concurrent): Allow async job low-code streams that are incremental to be run by the concurrent framework (airbytehq#228)
fix(low-code): Fix declarative low-code state migration in SubstreamPartitionRouter (airbytehq#267)
feat: combine slash command jobs into single job steps (airbytehq#266)
feat(low-code): add items and property mappings to dynamic schemas (airbytehq#256)
feat: add help response for unrecognized slash commands (airbytehq#264)
ci: post direct links to html connector test reports (airbytehq#252) (airbytehq#263)
fix(low-code): Fix legacy state migration in SubstreamPartitionRouter (airbytehq#261)
fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254)
feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236)
feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111)
fix: don't mypy unit_tests (airbytehq#241)
fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225)
feat(concurrent cursor): attempt at clamping datetime (airbytehq#234)
fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254)
feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236)
feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111)
fix: don't mypy unit_tests (airbytehq#241)
fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225)
feat(concurrent cursor): attempt at clamping datetime (airbytehq#234)
ci: use `ubuntu-24.04` explicitly (resolves CI warnings) (airbytehq#244)
Fix(sdm): module ref issue in python components import (airbytehq#243)
feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174)
chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133)
docs: comments on what the `Dockerfile` is for (airbytehq#240)
chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237)
Fix(sdm): module ref issue in python components import (airbytehq#243)
feat(low-code): add DpathFlattenFields (airbytehq#227)
feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174)
chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133)
docs: comments on what the `Dockerfile` is for (airbytehq#240)
chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237)

formatted

Update record_extractor.py

Trigger a new build. Hopefully, the integration test infrastructure is fixed.

Update CONTRIBUTING.md

Trigger a new build
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants