Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎉Source HubSpot: Adds form_submission and property_history streams #7787

Conversation

tinomerl
Copy link
Contributor

@tinomerl tinomerl commented Nov 9, 2021

What

How

Describe the solution

Recommended reading order

  1. x.java
  2. y.python

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/SUMMARY.md
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Connector Generator

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed.


This change is Reviewable

@CLAassistant
Copy link

CLAassistant commented Nov 9, 2021

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ tinomerl
✅ marcosmarxm
❌ Tino Merl


Tino Merl seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels Nov 9, 2021
@@ -34,5 +34,5 @@ COPY source_hubspot ./source_hubspot
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

LABEL io.airbyte.version=0.1.23
LABEL io.airbyte.version=0.2.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why changing to 0.2.0 and not 0.1.24?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcosmarxm thanks for the feedback. I bumped the minor version up since the two new streams add functionality and are not bugfixes.

@tinomerl tinomerl requested a review from marcosmarxm November 15, 2021 09:38
…bmissions' of github.com:tinomerl/airbyte into marcos/test-pr-7787
@@ -316,7 +316,7 @@ def _field_to_datetime(value: Union[int, str]) -> pendulum.datetime:
def _filter_old_records(self, records: Iterable) -> Iterable:
"""Skip records that was updated before our start_date"""
for record in records:
updated_at = record[self.updated_at_field]
updated_at = record.get(self.updated_at_field) or None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the .get method already use None as default

@marcosmarxm
Copy link
Member

I tried to run the tests locally and didn't work. Can you share the results from your tests, check https://docs.airbyte.io/connector-development/testing-connectors/source-acceptance-tests-reference#architecture-of-standard-tests or the checklist Updating connector

@tinomerl
Copy link
Contributor Author

@marcosmarxm currently running in timeouts when running the acceptance tests for basic_read and full_refresh. everything else works fine. also fixed your remark with .get but didn't commit because of the current timeouts.

i guess it's because there are no explicit endpoints for form submissions. the contact property changes could be reworked if i use the endpoint for recently changed contacts. but the problem with the form submissions is that in order to get the values for a form submission one needs first to get every form available and afterwards get the form submission data. currently there is no specific endpoint to get every form submission independently from the form.

@marcosmarxm
Copy link
Member

@marcosmarxm currently running in timeouts when running the acceptance tests for basic_read and full_refresh. everything else works fine. also fixed your remark with .get but didn't commit because of the current timeouts.

i guess it's because there are no explicit endpoints for form submissions. the contact property changes could be reworked if i use the endpoint for recently changed contacts. but the problem with the form submissions is that in order to get the values for a form submission one needs first to get every form available and afterwards get the form submission data. currently there is no specific endpoint to get every form submission independently from the form.

You can cache the response from forms? Or, you need to run a full refresh to forms?

@tinomerl
Copy link
Contributor Author

@marcosmarxm currently running in timeouts when running the acceptance tests for basic_read and full_refresh. everything else works fine. also fixed your remark with .get but didn't commit because of the current timeouts.
i guess it's because there are no explicit endpoints for form submissions. the contact property changes could be reworked if i use the endpoint for recently changed contacts. but the problem with the form submissions is that in order to get the values for a form submission one needs first to get every form available and afterwards get the form submission data. currently there is no specific endpoint to get every form submission independently from the form.

You can cache the response from forms? Or, you need to run a full refresh to forms?

@marcosmarxm how would i do the caching? I reduced the time range which will be called. last week i tried with about a year. also removed the property_history stream from the full_refresh since it creates a big output and is mostly the culprit for the timeouts. The only two tests failing right now are the read inputs in the acceptance tests. that is because we have a field in hubspot which contains a date but hubspot says it's property is a datetime. and the tests fail therefore. i committed everything and attached the logs.

Unit tests
python -m pytest unit_tests
Test session starts (platform: linux, Python 3.8.10, pytest 6.1.2, pytest-sugar 0.9.4)
cachedir: .pytest_cache
rootdir: /mnt/c/Users/Tino.Merl/Documents/repos/airbyte_dev, configfile: pytest.ini
plugins: sugar-0.9.4, timeout-1.4.2, requests-mock-1.8.0
collecting ... {"type": "LOG", "log": {"level": "INFO", "message": "Backing off get(...) for 0.0s (source_hubspot.errors.HubspotRateLimited: 429 Rate Limit Exceeded: API rate-limit has been reached until 0 seconds. See https://developers.hubspot.com/docs/api/usage-details)"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Rate limit reached. Sleeping for 0 seconds"}}

 airbyte-integrations/connectors/source-hubspot/unit_tests/test_client.py::test_client_backoff_on_limit_reached ✓                                                                                           2% ▍         {"type": "LOG", "log": {"level": "INFO", "message": "Backing off get(...) for 5.0s (requests.exceptions.HTTPError: 500 Server Error: None for url: https://api.hubapi.com/properties/v2/contact/properties?hapikey=wrong_key)"}}
{"type": "LOG", "log": {"level": "INFO", "message": "500 Server Error: None for url: https://api.hubapi.com/properties/v2/contact/properties?hapikey=wrong_key"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Caught retryable error after 1 tries. Waiting 5 more seconds then retrying..."}}

 airbyte-integrations/connectors/source-hubspot/unit_tests/test_client.py::test_client_backoff_on_server_error ✓                                                                                            5% ▌         {"type": "LOG", "log": {"level": "WARN", "message": "Stream `None` cannot be procced. This hapikey (THIS-IS-THE-API_KEY) does not have proper permissions! (requires any of [automation-access])"}}

 airbyte-integrations/connectors/source-hubspot/unit_tests/test_client.py::test_wrong_permissions_api_key ✓                                                                                                 8% ▊         
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_client.py::TestSplittingPropertiesFunctionality.test_splitting_properties ✓                                                                10% █         
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_client.py::TestSplittingPropertiesFunctionality.test_stream_with_splitting_properties ✓                                                    12% █▍        
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_client.py::TestSplittingPropertiesFunctionality.test_stream_with_splitting_properties_with_pagination ✓                                    15% █▌        
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_client.py::TestSplittingPropertiesFunctionality.test_stream_with_splitting_properties_with_new_record ✓                                    18% █▊        
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[string-expected0] ✓                                                            20% ██        
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[integer-expected1] ✓                                                           22% ██▍       
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[number-expected2] ✓                                                            25% ██▌       
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[bool-expected3] ✓                                                              28% ██▊       
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[boolean-expected4] ✓                                                           30% ███       
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[enumeration-expected5] ✓                                                       32% ███▍      
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[object-expected6] ✓                                                            35% ███▌      
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[array-expected7] ✓                                                             38% ███▊      
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[date-expected8] ✓                                                              40% ████      
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[date-time-expected9] ✓                                                         42% ████▍     
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[datetime-expected10] ✓                                                         45% ████▌     
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[json-expected11] ✓                                                             48% ████▊     
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_field_type_format_converting[phone_number-expected12] ✓                                                     50% █████     {"type": "LOG", "log": {"level": "WARN", "message": "Unsupported type _unsupported_field_type_ found"}}

 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_bad_field_type_converting[_unsupported_field_type_-expected0] ✓                                             52% █████▍    {"type": "LOG", "log": {"level": "WARN", "message": "Unsupported type None found"}}

 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_bad_field_type_converting[None-expected1] ✓                                                                 55% █████▌    {"type": "LOG", "log": {"level": "WARN", "message": "Unsupported type 1 found"}}

 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_bad_field_type_converting[1-expected2] ✓                                                                    57% █████▊    
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types0-some_field-None-None-None] ✓                                      60% ██████    
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types1-some_field-None-None-None] ✓                                      62% ██████▍   
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types2-some_field-None-None-None] ✓                                      65% ██████▌   
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types3-some_field-None-None-None] ✓                                      68% ██████▊   
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types4-some_field-None-None-None] ✓                                      70% ███████   
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[string-some_field-test-None-test] ✓                                                     72% ███████▍  
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types6-some_field-123.456-None-123.456] ✓                                75% ███████▌  
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types7-user_id-123-None-123] ✓                                           78% ███████▊  
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types8-some_field-123-None-123] ✓                                        80% ████████  
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types9-some_field--None-] ✓                                              82% ████████▍ 
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types10-some_field--None-None] ✓                                         85% ████████▌ 
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types11-some_field--None-None] ✓                                         88% ████████▊ 
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types12-some_field--None-None] ✓                                         90% █████████ 
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types13-some_field--None-None] ✓                                         92% █████████▍
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types14-some_field--date-time-None] ✓                                    95% █████████▌
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types15-some_field--date-time-] ✓                                        98% █████████▊
 airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_cast_type_if_needed[declared_field_types16-some_field-2020-date-time-2020] ✓                               100% ██████████
=================================================================================================== warnings summary ====================================================================================================
airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_bad_field_type_converting[_unsupported_field_type_-expected0]
airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_bad_field_type_converting[None-expected1]
airbyte-integrations/connectors/source-hubspot/unit_tests/test_field_type_converting.py::test_bad_field_type_converting[1-expected2]
  /mnt/c/Users/Tino.Merl/Documents/repos/airbyte_dev/airbyte-integrations/connectors/source-hubspot/source_hubspot/api.py:419: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
    logger.warn(f"Unsupported type {field_type} found")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

Results (8.87s):
      40 passed
Integration Tests
python -m pytest integration_tests
Test session starts (platform: linux, Python 3.8.10, pytest 6.1.2, pytest-sugar 0.9.4)
cachedir: .pytest_cache
rootdir: /mnt/c/Users/Tino.Merl/Documents/repos/airbyte_dev, configfile: pytest.ini
plugins: sugar-0.9.4, timeout-1.4.2, requests-mock-1.8.0
collecting ... 

Results (0.24s):
acceptance Tests
python -m pytest integration_tests -p integration_tests.acceptance
Test session starts (platform: linux, Python 3.8.10, pytest 6.1.2, pytest-sugar 0.9.4)
cachedir: .pytest_cache
rootdir: /mnt/c/Users/Tino.Merl/Documents/repos/airbyte_dev, configfile: pytest.ini
plugins: sugar-0.9.4, timeout-1.4.2, requests-mock-1.8.0
collecting ... {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nspec\ninput: /tmp/pytest-of-tino/pytest-15/test_match_expected_inputs0_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_match_expected_inputs0_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_match_expected[inputs0] ✓                           6% ▋         
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_required[inputs0] ✓                                12% █▎        
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_optional[inputs0] ✓                                18% █▊        
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_has_secret[inputs0] ✓                              24% ██▍       
 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_secret_never_in_the_output[inputs0] ✓              29% ██▉       {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nspec\ninput: /tmp/pytest-of-tino/pytest-15/test_oauth_flow_parameters_inp0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_oauth_flow_parameters_inp0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestSpec.test_oauth_flow_parameters[inputs0] ✓                   35% ███▌      {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \ncheck --config tap_config.json\ninput: /tmp/pytest-of-tino/pytest-15/test_check_inputs0_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_check_inputs0_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestConnection.test_check[inputs0] ✓                             41% ████▎     {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \ncheck --config tap_config.json\ninput: /tmp/pytest-of-tino/pytest-15/test_check_inputs1_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_check_inputs1_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestConnection.test_check[inputs1] ✓                             47% ████▊     {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \ncheck --config tap_config.json\ninput: /tmp/pytest-of-tino/pytest-15/test_check_inputs2_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_check_inputs2_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestConnection.test_check[inputs2] ✓                             53% █████▍    {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \ncheck --config tap_config.json\ninput: /tmp/pytest-of-tino/pytest-15/test_check_inputs3_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_check_inputs3_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestConnection.test_check[inputs3] ✓                             59% █████▉    {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \ncheck --config tap_config.json\ninput: /tmp/pytest-of-tino/pytest-15/test_check_inputs4_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_check_inputs4_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestConnection.test_check[inputs4] ✓                             65% ██████▌   {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \ndiscover --config tap_config.json\ninput: /tmp/pytest-of-tino/pytest-15/test_discover_inputs0_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_discover_inputs0_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_discover[inputs0] ✓                           71% ███████▏  {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \ndiscover --config tap_config.json\ninput: /tmp/pytest-of-tino/pytest-15/test_defined_cursors_exist_in_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_defined_cursors_exist_in_0/run_1/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestDiscovery.test_defined_cursors_exist_in_schema[inputs0] ✓    76% ███████▋  {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nread --config tap_config.json --catalog catalog.json\ninput: /tmp/pytest-of-tino/pytest-15/test_read_inputs0_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_read_inputs0_0/run_1/output"}}
{"type": "LOG", "log": {"level": "ERROR", "message": "\nThe contacts stream has the following schema errors:\n2020-07-23 has invalid datetime format\n\nFailed validating 'format' in schema['properties']['properties']['properties']['p7_kontaktanfrage_datum']:\n    {'format': 'date-time', 'type': ['null', 'string']}\n\nOn instance['properties']['p7_kontaktanfrage_datum']:\n    '2020-07-23'"}}


――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― TestBasicRead.test_read[inputs0] ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――

self = <source_acceptance_test.tests.test_core.TestBasicRead object at 0x7f6a37809d90>, connector_config = SecretDict(******)
configured_catalog = ConfiguredAirbyteCatalog(streams=[ConfiguredAirbyteStream(stream=AirbyteStream(name='campaigns', json_schema={'$schema...l_refresh'>, cursor_field=None, destination_sync_mode=<DestinationSyncMode.overwrite: 'overwrite'>, primary_key=None)])
inputs = BasicReadTestConfig(config_path='secrets/config.json', configured_catalog_path='sample_files/full_refresh_catalog.json', empty_streams={'workflows', 'property_history'}, expect_records=None, validate_schema=True, timeout_seconds=None)
expected_records = [], docker_runner = <source_acceptance_test.utils.connector_runner.ConnectorRunner object at 0x7f6a378015e0>
detailed_logger = <AirbyteNativeLogger detailed_logger acceptance_tests_logs/test_core.py__TestBasicRead__test_read[inputs0].txt (DEBUG)>

    def test_read(
        self,
        connector_config,
        configured_catalog,
        inputs: BasicReadTestConfig,
        expected_records: List[AirbyteMessage],
        docker_runner: ConnectorRunner,
        detailed_logger,
    ):
        output = docker_runner.call_read(connector_config, configured_catalog)
        records = [message.record for message in filter_output(output, Type.RECORD)]
    
        assert records, "At least one record should be read using provided catalog"
    
        if inputs.validate_schema:
>           self._validate_schema(records=records, configured_catalog=configured_catalog)

../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:262: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

records = [AirbyteRecordMessage(stream='campaigns', data={'id': 89263047, 'lastUpdatedTime': 1637571915271, 'appId': 2286, 'appN...red': 2, 'sent': 2, 'click': 2, 'open': 2}, 'type': 'AUTOMATED_EMAIL'}, emitted_at=1637587909000, namespace=None), ...]
configured_catalog = ConfiguredAirbyteCatalog(streams=[ConfiguredAirbyteStream(stream=AirbyteStream(name='campaigns', json_schema={'$schema...l_refresh'>, cursor_field=None, destination_sync_mode=<DestinationSyncMode.overwrite: 'overwrite'>, primary_key=None)])

    @staticmethod
    def _validate_schema(records: List[AirbyteRecordMessage], configured_catalog: ConfiguredAirbyteCatalog):
        """
        Check if data type and structure in records matches the one in json_schema of the stream in catalog
        """
        TestBasicRead._validate_records_structure(records, configured_catalog)
        bar = "-" * 80
        streams_errors = verify_records_schema(records, configured_catalog)
        for stream_name, errors in streams_errors.items():
            errors = map(str, errors.values())
            str_errors = f"\n{bar}\n".join(errors)
            logging.error(f"\nThe {stream_name} stream has the following schema errors:\n{str_errors}")
    
        if streams_errors:
>           pytest.fail(f"Please check your json_schema in selected streams {tuple(streams_errors.keys())}.")
E           Failed: Please check your json_schema in selected streams ('contacts',).

../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:207: Failed
----------------------------------------------------------------------------- Captured log call -----------------------------------------------------------------------------
INFO     root:connector_runner.py:93 Docker run: 
read --config tap_config.json --catalog catalog.json
input: /tmp/pytest-of-tino/pytest-15/test_read_inputs0_0/run_1/input
output: /tmp/pytest-of-tino/pytest-15/test_read_inputs0_0/run_1/output
ERROR    root:test_core.py:204 
The contacts stream has the following schema errors:
2020-07-23 has invalid datetime format

Failed validating 'format' in schema['properties']['properties']['properties']['p7_kontaktanfrage_datum']:
    {'format': 'date-time', 'type': ['null', 'string']}

On instance['properties']['p7_kontaktanfrage_datum']:
    '2020-07-23'

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestBasicRead.test_read[inputs0] ⨯                               82% ████████▎ {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nread --config tap_config.json --catalog catalog.json\ninput: /tmp/pytest-of-tino/pytest-15/test_read_inputs1_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_read_inputs1_0/run_1/output"}}
{"type": "LOG", "log": {"level": "ERROR", "message": "\nThe contacts stream has the following schema errors:\n2020-07-23 has invalid datetime format\n\nFailed validating 'format' in schema['properties']['properties']['properties']['p7_kontaktanfrage_datum']:\n    {'format': 'date-time', 'type': ['null', 'string']}\n\nOn instance['properties']['p7_kontaktanfrage_datum']:\n    '2020-07-23'"}}


――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― TestBasicRead.test_read[inputs1] ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――

self = <source_acceptance_test.tests.test_core.TestBasicRead object at 0x7f6a319a8880>, connector_config = SecretDict(******)
configured_catalog = ConfiguredAirbyteCatalog(streams=[ConfiguredAirbyteStream(stream=AirbyteStream(name='campaigns', json_schema={'$schema...l_refresh'>, cursor_field=None, destination_sync_mode=<DestinationSyncMode.overwrite: 'overwrite'>, primary_key=None)])
inputs = BasicReadTestConfig(config_path='secrets/config_oauth.json', configured_catalog_path='sample_files/configured_catalog_...treams={'campaigns', 'workflows', 'property_history'}, expect_records=None, validate_schema=True, timeout_seconds=None)
expected_records = [], docker_runner = <source_acceptance_test.utils.connector_runner.ConnectorRunner object at 0x7f6a3733c7c0>
detailed_logger = <AirbyteNativeLogger detailed_logger acceptance_tests_logs/test_core.py__TestBasicRead__test_read[inputs1].txt (DEBUG)>

    def test_read(
        self,
        connector_config,
        configured_catalog,
        inputs: BasicReadTestConfig,
        expected_records: List[AirbyteMessage],
        docker_runner: ConnectorRunner,
        detailed_logger,
    ):
        output = docker_runner.call_read(connector_config, configured_catalog)
        records = [message.record for message in filter_output(output, Type.RECORD)]
    
        assert records, "At least one record should be read using provided catalog"
    
        if inputs.validate_schema:
>           self._validate_schema(records=records, configured_catalog=configured_catalog)

../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:262: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

records = [AirbyteRecordMessage(stream='campaigns', data={'id': 89263047, 'lastUpdatedTime': 1637571915271, 'appId': 2286, 'appN...red': 2, 'sent': 2, 'click': 2, 'open': 2}, 'type': 'AUTOMATED_EMAIL'}, emitted_at=1637588004000, namespace=None), ...]
configured_catalog = ConfiguredAirbyteCatalog(streams=[ConfiguredAirbyteStream(stream=AirbyteStream(name='campaigns', json_schema={'$schema...l_refresh'>, cursor_field=None, destination_sync_mode=<DestinationSyncMode.overwrite: 'overwrite'>, primary_key=None)])

    @staticmethod
    def _validate_schema(records: List[AirbyteRecordMessage], configured_catalog: ConfiguredAirbyteCatalog):
        """
        Check if data type and structure in records matches the one in json_schema of the stream in catalog
        """
        TestBasicRead._validate_records_structure(records, configured_catalog)
        bar = "-" * 80
        streams_errors = verify_records_schema(records, configured_catalog)
        for stream_name, errors in streams_errors.items():
            errors = map(str, errors.values())
            str_errors = f"\n{bar}\n".join(errors)
            logging.error(f"\nThe {stream_name} stream has the following schema errors:\n{str_errors}")
    
        if streams_errors:
>           pytest.fail(f"Please check your json_schema in selected streams {tuple(streams_errors.keys())}.")
E           Failed: Please check your json_schema in selected streams ('contacts',).

../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:207: Failed
----------------------------------------------------------------------------- Captured log call -----------------------------------------------------------------------------
INFO     root:connector_runner.py:93 Docker run: 
read --config tap_config.json --catalog catalog.json
input: /tmp/pytest-of-tino/pytest-15/test_read_inputs1_0/run_1/input
output: /tmp/pytest-of-tino/pytest-15/test_read_inputs1_0/run_1/output
ERROR    root:test_core.py:204 
The contacts stream has the following schema errors:
2020-07-23 has invalid datetime format

Failed validating 'format' in schema['properties']['properties']['properties']['p7_kontaktanfrage_datum']:
    {'format': 'date-time', 'type': ['null', 'string']}

On instance['properties']['p7_kontaktanfrage_datum']:
    '2020-07-23'

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestBasicRead.test_read[inputs1] ⨯                               88% ████████▉ {"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nread --config tap_config.json --catalog catalog.json\ninput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs0_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs0_0/run_1/output"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nread --config tap_config.json --catalog catalog.json\ninput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs0_0/run_2/input\noutput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs0_0/run_2/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_full_refresh.py::TestFullRefresh.test_sequential_reads[inputs0] ✓         94% █████████▌{"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nread --config tap_config.json --catalog catalog.json\ninput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs1_0/run_1/input\noutput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs1_0/run_1/output"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Docker run: \nread --config tap_config.json --catalog catalog.json\ninput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs1_0/run_2/input\noutput: /tmp/pytest-of-tino/pytest-15/test_sequential_reads_inputs1_0/run_2/output"}}

 airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_full_refresh.py::TestFullRefresh.test_sequential_reads[inputs1] ✓        100% ██████████
========================================================================== short test summary info ==========================================================================
SKIPPED [1] ../../bases/source-acceptance-test/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
FAILED ../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestBasicRead::test_read[inputs0] - Failed: Please check your json_schema in selected...
FAILED ../../bases/source-acceptance-test/source_acceptance_test/tests/test_core.py::TestBasicRead::test_read[inputs1] - Failed: Please check your json_schema in selected...

Results (605.80s):
      15 passed
       2 failed
         - airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:247 TestBasicRead.test_read[inputs0]
         - airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py:247 TestBasicRead.test_read[inputs1]

…bmissions' of github.com:tinomerl/airbyte into marcos/test-pr-7787
@yevhenii-ldv
Copy link
Contributor

Hi @tinomerl!
In order to increase the execution time of tests, and so that they are not interrupted by a timeout, you can specify the value of timeout_seconds in the acceptance-test-config.yml file. You can see how this is done for the Prestashop connector.

page_filter = "vidOffset"

def list(self, fields) -> Iterable:
properties = self._api.get(f"/properties/v2/contact/properties")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t understand why we can’t just define the contact for this stream as self.entity and not override the list method, it will be executed as in other streams, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entity can't be used because we have to call a legacy API endpoint. The newer API doesn't have the possibility to call for the history of a contact property.
And since the data is also saved in a two column way with property and value as their names the value column would hold different properties and have a mixed type.

yield from self.read(partial(self._api.get, url=self.url), params)

def _transform(self, records: Iterable) -> Iterable:
record: Dict[str, Dict]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
record: Dict[str, Dict]

I think that this line does not carry any information we need, it is rather superfluous than useful

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Comment on lines 753 to 758
params = {
"limit": 50,
}
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params)
for row in forms:
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
params = {
"limit": 50,
}
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params)
for row in forms:
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params)
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"))
for row in forms:
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"))

There is no need to explicitly pass only the limit in the parameters, since we will get it from the class variable anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Comment on lines 756 to 758
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params)
for row in forms:
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
forms = self.read(partial(self._api.get, url="/marketing/v3/forms"), params)
for row in forms:
record = self.read(partial(self._api.get, url=f"{self.url}/{row['id']}"), params)
forms_stream = FormStream(api=self._api, start_date=str(self._start_date))
for row in forms_stream.list(fields={}):

I think, it will be more clearly, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is better. i added it.

Comment on lines 730 to 731
if key == "lastmodifieddate":
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please leave a comment as to why we are skipping processing for the lastmodifieddate key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment is added.

},
"objectTypeId": {
"value": {
"type": ["null", "string"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are u sure, that field which consists Id in name has string type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah HubSpot returns a value which consists only of this string "0-1". Therefore String.

"type": ["null", "string"]
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please, add empty string in the end of file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

"source-type": {
"type": ["null", "string"]
},
"source-id": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Property can either be changed by workflow, form or manully. The id for either is saved in the source-id . The thin is, that for workflow the id is really a id as in integer. but the id of a form is a mixture of characters and numbers. Therefore it is safer to just cast this as a string.

Copy link
Contributor

@yevhenii-ldv yevhenii-ldv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments.
Also, please, run command ./gradlew format to format project. This will format the changes in the project to the required template.
NOTE: If the formatting process also affects files in which you did not make changes, please roll back the changes to those files to the state in which they were before. Thanks 😃

@tinomerl
Copy link
Contributor Author

Hey @yevhenii-ldv thanks for the Feedback! I'll get to working to after the 19th since i'm not available next week. Just wanted to give you a heads up and let u know. :)

@marcosmarxm
Copy link
Member

@tinomerl I added the label blocked let me know when you return working on this.

@tinomerl
Copy link
Contributor Author

Hey @yevhenii-ldv

sorry for taking so long to work on your feedback. i implemented everything. The tests still throw the same error for contact lists mentioned here. But additionally i get a sequential read error for the property history stream. The problem lies in the calculated fields like hs_time_in_subscriber. I tried filtering it out via the acceptance-test-config.yml. The thing is the field which is calculated is a value and not a field in the final table. so it looks like this :

property value changed_date
hs_time_in_subscriber 123456789 2021-01-01T00:00:00Z

The second call will of course have a higher value for the same property. Is there a way to filter out a field if it contains a specific value?

@marcosmarxm
Copy link
Member

@tinomerl can you solve the conflicts? Other PR add the form submissions stream.

@tinomerl
Copy link
Contributor Author

@marcosmarxm solved the conflicts

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets January 26, 2022 03:19 Inactive
@marcosmarxm marcosmarxm merged commit b40b303 into airbytehq:master Jan 26, 2022
marcosmarxm added a commit that referenced this pull request Jan 26, 2022
* github schema

* GitHub dockerfile

* formatted

* 🎉Source HubSpot: Adds form_submission and property_history streams (#7787)

* Began working on HubSpot Form Submission Connector

* Added Property History Stream

* Added form_guid to as value to form_submissions_stream.

* Finalized the Form Submission Stream

* Added documentation and test config

* Corrected Version Number

* updated version number to 0.1.25

* removed or none worked on tests

* Changed code due to review comments & merges

* readded Propertyhistory after merging

* bump connector version

Co-authored-by: Tino Merl <tino.merl@park-sieben.com>
Co-authored-by: Marcos Marx <marcosmarxm@gmail.com>

* bump connector version

Co-authored-by: Tino Merl <35485536+tinomerl@users.noreply.github.com>
Co-authored-by: Tino Merl <tino.merl@park-sieben.com>
Co-authored-by: Marcos Marx <marcosmarxm@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation blocked community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants