🎉 New Source: PyPI [low-code cdk] #18632

Merged (21 commits) on Nov 10, 2022

@NoelJacob (Contributor) commented Oct 28, 2022

What

Data from PyPI as source

How

Used Low-code API

Recommended reading order

  1. spec.yaml
  2. pypi.yaml

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Tests

Acceptance
Test session starts (platform: linux, Python 3.9.11, pytest 6.2.5, pytest-sugar 0.9.5)
rootdir: /test_input
plugins: hypothesis-6.54.6, cov-3.0.0, mock-3.6.1, sugar-0.9.5, timeout-1.4.2, requests-mock-1.9.3
collecting ... 
 test_core.py ✓✓✓✓✓✓✓✓✓✓s✓✓✓✓✓✓✓✓✓✓✓s✓✓                                                                                                                    96% █████████▋
 test_full_refresh.py ✓                                                                                                                                   100% ██████████

======================================================================== short test summary info ========================================================================
SKIPPED [1] source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: This connector does not implement incremental sync
SKIPPED [1] source_acceptance_test/tests/test_core.py:51: The previous connector image could not be retrieved.
SKIPPED [1] source_acceptance_test/tests/test_core.py:229: The previous connector image could not be retrieved.

Results (66.53s):
      24 passed
       2 skipped

* Init

* Update acceptance-test-config.yml

* Update

* Update acceptance-test-config.yml
@CLAassistant commented Oct 28, 2022

CLA assistant check
All committers have signed the CLA.

@NoelJacob (Contributor Author) commented Oct 28, 2022

Please do not merge, I have documentation to do.
Edit: Done!

@NoelJacob (Contributor Author)

@koconder secrets/config is not present in the repo, which is probably why it's failing. I'll try changing the acceptance-test file to point to the sample config.

@vincentkoc (Contributor)

@NoelJacob this is normal, as there is no test config file for this connector yet on the Airbyte side. Please hang tight while I review.

@airbytehq airbytehq deleted a comment from vincentkoc Oct 31, 2022
@marcosmarxm (Member) left a comment

Hello @NoelJacob, Marcos from Airbyte here 👋. We received more than 25 new contributions over the weekend, and one of them is yours 🎉 Thank you so much! Our team is limited, so the review process may take longer than expected. As described in Airbyte's Hacktoberfest rules, your contribution was submitted before November 2nd and is eligible to win the prize. The review process will validate the other requirements. I ask for your patience until someone from the team reviews it.

Having reviewed several Hacktoberfest contributions so far, I have seen some common patterns you can check in advance:

  • Make sure you have added connector documentation to /docs/integrations/
  • Remove the file catalog from /integration_tests
  • Edit the sample_config.json inside /integration_tests
  • For the configured_catalog you can use only json_schema: {}
  • Add title to all properties in the spec.yaml
  • Make sure the documentationUrl in the spec.yaml points to Airbyte's future connector page; e.g. for the Airtable connector, documentationUrl: https://docs.airbyte.com/integrations/sources/airtable
  • Check the newline at EOF (end-of-file) in all files.
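Two of the items above (a title on every property, and a documentationUrl under the Airbyte docs site) can be checked mechanically. The following is a minimal sketch, not part of the PR: `lint_spec` is a hypothetical helper, the PyPI docs URL and the property titles in `example_spec` are assumed for illustration, and the spec is shown as an already parsed dictionary rather than loaded from spec.yaml.

```python
from typing import Any, Dict, List

# Prefix taken from the Airtable example in the checklist above.
DOCS_PREFIX = "https://docs.airbyte.com/integrations/sources/"


def lint_spec(spec: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a parsed connector spec."""
    problems: List[str] = []

    url = spec.get("documentationUrl", "")
    if not url.startswith(DOCS_PREFIX):
        problems.append(f"documentationUrl should start with {DOCS_PREFIX}")

    properties = spec.get("connectionSpecification", {}).get("properties", {})
    for name, prop in properties.items():
        if "title" not in prop:
            problems.append(f"property '{name}' has no title")

    return problems


# Hypothetical spec for illustration; 'version' lacks a title on purpose.
example_spec = {
    "documentationUrl": "https://docs.airbyte.com/integrations/sources/pypi",
    "connectionSpecification": {
        "type": "object",
        "properties": {
            "project_name": {"type": "string", "title": "PyPI Package"},
            "version": {"type": "string"},
        },
    },
}

print(lint_spec(example_spec))
```

Running it on the example reports only the missing title on `version`.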

If possible, send me a DM in Slack with the test credentials; this will make it easier for us to run integration tests and publish your connector. If you only have production keys, make sure to create a bootstrap.md explaining how to get the keys.

@marcosmarxm marcosmarxm changed the title 🎉 New Source: PyPI 🎉 New Source: PyPI [low-code cdk] Oct 31, 2022
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Nov 2, 2022
@NoelJacob NoelJacob requested review from marcosmarxm and removed request for vincentkoc November 2, 2022 06:39
@vincentkoc (Contributor) left a comment

@NoelJacob it would be good if you could look at some of the minor changes. I will go ahead and run a quick local acceptance test and see where we go from there. Thanks again for the contribution.

@vincentkoc (Contributor)

Test results:

=================================== FAILURES ===================================
_______________________ TestBasicRead.test_read[inputs0] _______________________

self = <source_acceptance_test.tests.test_core.TestBasicRead object at 0xffffb6af6070>
connector_config = SecretDict(******)
configured_catalog = ConfiguredAirbyteCatalog(streams=[ConfiguredAirbyteStream(stream=AirbyteStream(name='project', json_schema={'$schema':...l_refresh'>, cursor_field=None, destination_sync_mode=<DestinationSyncMode.overwrite: 'overwrite'>, primary_key=None)])
inputs = BasicReadTestConfig(config_path='secrets/config.json', configured_catalog_path='integration_tests/configured_catalog.j...ds=True), validate_schema=True, validate_data_points=False, expect_trace_message_on_failure=True, timeout_seconds=None)
expected_records_by_stream = defaultdict(<class 'list'>, {'stats': [{'top_packages': {'CodeIntel': {'size': 23767329521}, 'MegEngine': {'size': 151...82ecf9e0e43af740c79ccd/sampleproject-2.0.0.tar.gz', 'yanked': False, 'yanked_reason': None}], 'vulnerabilities': []}]})
docker_runner = <source_acceptance_test.utils.connector_runner.ConnectorRunner object at 0xffffb6c35bb0>
detailed_logger = <Logger detailed_logger /test_input/acceptance_tests_logs/test_core.py__TestBasicRead__test_read[inputs0].txt (DEBUG)>

    def test_read(
        self,
        connector_config,
        configured_catalog,
        inputs: BasicReadTestConfig,
        expected_records_by_stream: MutableMapping[str, List[MutableMapping]],
        docker_runner: ConnectorRunner,
        detailed_logger,
    ):
        output = docker_runner.call_read(connector_config, configured_catalog)
        records = [message.record for message in filter_output(output, Type.RECORD)]
    
        assert records, "At least one record should be read using provided catalog"
    
        if inputs.validate_schema:
>           self._validate_schema(records=records, configured_catalog=configured_catalog)

/usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:480: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

records = [AirbyteRecordMessage(namespace=None, stream='project', data={'info': {'author': 'Hipages Data Team', 'author_email': ...ze': 27927907201}, 'xpress': {'size': 20577744315}}, 'total_packages_size': 14300830071491}, emitted_at=1667482761421)]
configured_catalog = ConfiguredAirbyteCatalog(streams=[ConfiguredAirbyteStream(stream=AirbyteStream(name='project', json_schema={'$schema':...l_refresh'>, cursor_field=None, destination_sync_mode=<DestinationSyncMode.overwrite: 'overwrite'>, primary_key=None)])

    @staticmethod
    def _validate_schema(records: List[AirbyteRecordMessage], configured_catalog: ConfiguredAirbyteCatalog):
        """
        Check if data type and structure in records matches the one in json_schema of the stream in catalog
        """
        TestBasicRead._validate_records_structure(records, configured_catalog)
        bar = "-" * 80
        streams_errors = verify_records_schema(records, configured_catalog)
        for stream_name, errors in streams_errors.items():
            errors = map(str, errors.values())
            str_errors = f"\n{bar}\n".join(errors)
            logging.error(f"\nThe {stream_name} stream has the following schema errors:\n{str_errors}")
    
        if streams_errors:
>           pytest.fail(f"Please check your json_schema in selected streams {tuple(streams_errors.keys())}.")
E           Failed: Please check your json_schema in selected streams ('project', 'release').

/usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:382: Failed
----------------------------- Captured stdout call -----------------------------
{"type": "LOG", "log": {"level": "ERROR", "message": "\nThe project stream has the following schema errors:\nNone is not of type 'string'\n\nFailed validating 'type' in schema['properties']['info']['properties']['platform']:\n    {'description': '[DEPRECATED]', 'type': 'string'}\n\nOn instance['info']['platform']:\n    None\n--------------------------------------------------------------------------------\nNone is not of type 'object'\n\nFailed validating 'type' in schema['properties']['info']['properties']['project_urls']:\n    {'description': 'Additional URLs that are relevant to your project. '\n                    'Corresponds to '\n                    'https://packaging.python.org/specifications/core-metadata/#project-url-multiple-use',\n     'patternProperties': {'.*': {'type': 'string'}},\n     'type': 'object'}\n\nOn instance['info']['project_urls']:\n    None"}}
{"type": "LOG", "log": {"level": "ERROR", "message": "\nThe release stream has the following schema errors:\nNone is not of type 'string'\n\nFailed validating 'type' in schema['properties']['info']['properties']['platform']:\n    {'description': '[DEPRECATED]', 'type': 'string'}\n\nOn instance['info']['platform']:\n    None\n--------------------------------------------------------------------------------\nNone is not of type 'object'\n\nFailed validating 'type' in schema['properties']['info']['properties']['project_urls']:\n    {'description': 'Additional URLs that are relevant to your project. '\n                    'Corresponds to '\n                    'https://packaging.python.org/specifications/core-metadata/#project-url-multiple-use',\n     'patternProperties': {'.*': {'type': 'string'}},\n     'type': 'object'}\n\nOn instance['info']['project_urls']:\n    None"}}
------------------------------ Captured log call -------------------------------
ERROR    root:test_core.py:379 
The project stream has the following schema errors:
None is not of type 'string'

Failed validating 'type' in schema['properties']['info']['properties']['platform']:
    {'description': '[DEPRECATED]', 'type': 'string'}

On instance['info']['platform']:
    None
--------------------------------------------------------------------------------
None is not of type 'object'

Failed validating 'type' in schema['properties']['info']['properties']['project_urls']:
    {'description': 'Additional URLs that are relevant to your project. '
                    'Corresponds to '
                    'https://packaging.python.org/specifications/core-metadata/#project-url-multiple-use',
     'patternProperties': {'.*': {'type': 'string'}},
     'type': 'object'}

On instance['info']['project_urls']:
    None
ERROR    root:test_core.py:379 
The release stream has the following schema errors:
None is not of type 'string'

Failed validating 'type' in schema['properties']['info']['properties']['platform']:
    {'description': '[DEPRECATED]', 'type': 'string'}

On instance['info']['platform']:
    None
--------------------------------------------------------------------------------
None is not of type 'object'

Failed validating 'type' in schema['properties']['info']['properties']['project_urls']:
    {'description': 'Additional URLs that are relevant to your project. '
                    'Corresponds to '
                    'https://packaging.python.org/specifications/core-metadata/#project-url-multiple-use',
     'patternProperties': {'.*': {'type': 'string'}},
     'type': 'object'}

On instance['info']['project_urls']:
    None
=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - Failed: Please check...
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: This connector does not implement incremental sync
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:51: The previous connector image could not be retrieved.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:229: The previous connector image could not be retrieved.
=================== 1 failed, 23 passed, 3 skipped in 27.14s ===================
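Both failures above come from fields (`info.platform`, `info.project_urls`) whose schema declares a single type while the record carries `None`. One common way to resolve this kind of error is to allow `null` alongside the declared type. The thread does not say which change was actually made in this PR, so the sketch below only illustrates the idea; `matches_type` is a hypothetical, simplified stand-in for a JSON Schema `type` check, not the validator used by the acceptance tests.

```python
from typing import Any, List, Union

# Simplified mapping from JSON Schema type names to Python types.
JSON_TYPES = {
    "string": str,
    "object": dict,
    "null": type(None),
}


def matches_type(value: Any, declared: Union[str, List[str]]) -> bool:
    """Check a value against a JSON Schema 'type' keyword (string or list form)."""
    names = [declared] if isinstance(declared, str) else declared
    return any(isinstance(value, JSON_TYPES[name]) for name in names)


# Schema fragment as reported in the failure log.
strict_platform = {"description": "[DEPRECATED]", "type": "string"}
# Assumed fix: accept null as well as string.
nullable_platform = {"description": "[DEPRECATED]", "type": ["null", "string"]}

print(matches_type(None, strict_platform["type"]))    # False: reproduces the error
print(matches_type(None, nullable_platform["type"]))  # True: None is now accepted
```

The same adjustment would apply to `project_urls`, with `["null", "object"]` in place of `"object"`.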

@vincentkoc (Contributor)

All tests are now passing, I made some changes to make it all work.
@NoelJacob we will get this finalised and published as soon as we can.


=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: This connector does not implement incremental sync
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:51: The previous connector image could not be retrieved.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:229: The previous connector image could not be retrieved.
======================== 24 passed, 3 skipped in 26.25s ========================

Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

See https://docs.gradle.org/7.4/userguide/command_line_interface.html#sec:command_line_warnings

Execution optimizations have been disabled for 1 invalid unit(s) of work during this build to ensure correctness.
Please consult deprecation warnings for more details.

BUILD SUCCESSFUL in 3m 11s
46 actionable tasks: 21 executed, 25 up-to-date

@marcosmarxm as this has no auth, I’m using the following config:

{
    "project_name": "airbyte-cdk",
    "version": "0.5.2"
}
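Since the config carries only a project name and a version, it can be mapped directly onto PyPI's public JSON API, which needs no authentication. The sketch below assumes that the `project` and `release` streams named in the test output correspond to the `/pypi/<project>/json` and `/pypi/<project>/<version>/json` endpoints; that mapping and the helper name `pypi_json_urls` are assumptions, not taken from the connector's pypi.yaml. It only builds URLs and makes no network calls.

```python
from typing import Dict
from urllib.parse import quote

BASE_URL = "https://pypi.org"


def pypi_json_urls(config: Dict[str, str]) -> Dict[str, str]:
    """Build PyPI JSON API URLs from a connector-style config (hypothetical helper)."""
    project = quote(config["project_name"], safe="")
    urls = {"project": f"{BASE_URL}/pypi/{project}/json"}

    # The release URL is only built when a version is present in the config.
    version = config.get("version")
    if version:
        release = quote(version, safe="")
        urls["release"] = f"{BASE_URL}/pypi/{project}/{release}/json"

    return urls


config = {"project_name": "airbyte-cdk", "version": "0.5.2"}
for stream, url in pypi_json_urls(config).items():
    print(stream, url)
```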

@sajarin sajarin added the bounty-XL Maintainer program: claimable extra large bounty PR label Nov 7, 2022
@vincentkoc (Contributor) commented Nov 7, 2022

/test connector=connectors/source-pypi

🕑 connectors/source-pypi https://github.com/airbytehq/airbyte/actions/runs/3414755524
✅ connectors/source-pypi https://github.com/airbytehq/airbyte/actions/runs/3414755524
Python tests coverage:

	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          12      4    67%   16-19
	 source_acceptance_test/config.py                       133      3    98%   87, 93, 230
	 source_acceptance_test/conftest.py                     196     97    51%   35, 41-43, 48, 54, 60, 66, 72-74, 80-95, 100, 105-107, 113-115, 121-122, 127-128, 133, 139, 148-157, 163-168, 232, 238, 244-250, 258-263, 271-284, 289-295, 302-313, 320-336
	 source_acceptance_test/plugin.py                        69     25    64%   22-23, 31, 36, 120-140, 144-148
	 source_acceptance_test/tests/test_core.py              329    106    68%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 377-379, 382, 447-455, 484-485, 491, 494, 530-540, 553-578
	 source_acceptance_test/tests/test_incremental.py       145     20    86%   21-23, 29-31, 36-43, 48-61, 224
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     10    87%   15-16, 24-30, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/config_migration.py        23     23     0%   5-37
	 source_acceptance_test/utils/connector_runner.py       112     50    55%   23-26, 32, 36, 39-68, 71-73, 76-78, 81-83, 86-88, 91-93, 96-114, 148-150
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1479    376    75%
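For readers checking the table, the Cover column is consistent with the share of statements that were executed, (Stmts - Miss) / Stmts, rounded to a whole percent. A minimal sketch of that arithmetic follows; `cover_percent` is a hypothetical helper, and plain rounding is an assumption that happens to match the rows shown here rather than a statement of how the coverage tool rounds in every case.

```python
def cover_percent(stmts: int, miss: int) -> int:
    """Percentage of statements executed, rounded to the nearest whole percent."""
    return round(100 * (stmts - miss) / stmts)


# Rows copied from the coverage table above: (name, Stmts, Miss).
rows = [
    ("source_acceptance_test/base.py", 12, 4),
    ("source_acceptance_test/config.py", 133, 3),
    ("source_acceptance_test/utils/config_migration.py", 23, 23),
    ("TOTAL", 1479, 376),
]

for name, stmts, miss in rows:
    print(f"{name}: {cover_percent(stmts, miss)}%")
```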

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: This connector does not implement incremental sync
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:51: The previous connector image could not be retrieved.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:229: The previous connector image could not be retrieved.
======================== 24 passed, 3 skipped in 23.00s ========================

@vincentkoc (Contributor) commented Nov 7, 2022

/publish connector=connectors/source-pypi

🕑 Publishing the following connectors:
connectors/source-pypi
https://github.com/airbytehq/airbyte/actions/runs/3414949927


Connector Did it publish? Were definitions generated?
connectors/source-pypi

If you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@sajarin sajarin merged commit 22f03e3 into airbytehq:master Nov 10, 2022
akashkulk pushed a commit that referenced this pull request Dec 2, 2022
* PyPI (#11)

* Init

* Update acceptance-test-config.yml

* Update

* Update acceptance-test-config.yml

* Add requested changes and docs

* Update acceptance.py

* Update acceptance-test-config.yml

* Update setup.py

* fix EOFL

* Update README.md

* Update README.md

* changes to pass tests

* Update source_definitions.yaml

* Update source_specs.yaml

Co-authored-by: Vincent Koc <koconder@users.noreply.github.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation bounty bounty-XL Maintainer program: claimable extra large bounty PR community connectors/source/pypi hacktober low-code
7 participants