Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎉Source Google Analytics Data Api: code refactoring #15229

Merged
merged 6 commits into from
Aug 17, 2022

Conversation

roman-yermilov-gl
Copy link
Contributor

What

Refactoring of the Google analytics data API source. Use REST API calls instead of library and support a default set of reports

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@CLAassistant
Copy link

CLAassistant commented Aug 3, 2022

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the area/connectors Connector related issues label Aug 3, 2022
@roman-yermilov-gl roman-yermilov-gl force-pushed the ryermilov/source-google-analytics-data-api branch from 54bd978 to befbc31 Compare August 4, 2022 14:06
@roman-yermilov-gl
Copy link
Contributor Author

roman-yermilov-gl commented Aug 4, 2022

/test connector=connectors/source-google-analytics-data-api

🕑 connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2798078547
❌ connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2798078547
🐛 https://gradle.com/s/3tjtwplgwisng

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestConnection::test_check[inputs0] - AssertionError: as...
FAILED test_core.py::TestDiscovery::test_discover[inputs0] - docker.errors.Co...
ERROR test_core.py::TestDiscovery::test_defined_cursors_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_defined_refs_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-allOf]
ERROR test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-not]
ERROR test_core.py::TestDiscovery::test_primary_keys_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_streams_has_sync_modes[inputs0] - doc...
ERROR test_core.py::TestDiscovery::test_additional_properties_is_true[inputs0]
ERROR test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contain...
ERROR test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
======== 2 failed, 14 passed, 1 skipped, 1 warning, 9 errors in 23.10s =========

@roman-yermilov-gl
Copy link
Contributor Author

roman-yermilov-gl commented Aug 4, 2022

/test connector=connectors/source-google-analytics-data-api

🕑 connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2798571489
✅ connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2798571489
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        81      6    93%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  77     17    78%
source_acceptance_test/tests/test_core.py              328    121    63%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  995    261    74%
Name                                                Stmts   Miss  Cover
-----------------------------------------------------------------------
source_google_analytics_data_api/__init__.py            2      0   100%
source_google_analytics_data_api/source.py            159     49    69%
source_google_analytics_data_api/utils.py               8      3    62%
source_google_analytics_data_api/authenticator.py      44     23    48%
-----------------------------------------------------------------------
TOTAL                                                 213     75    65%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
============= 25 passed, 1 skipped, 1 warning in 263.06s (0:04:23) =============

@roman-yermilov-gl roman-yermilov-gl force-pushed the ryermilov/source-google-analytics-data-api branch from befbc31 to cd52596 Compare August 5, 2022 17:54
@roman-yermilov-gl roman-yermilov-gl requested a review from a team as a code owner August 5, 2022 17:54
@github-actions github-actions bot added area/platform issues related to the platform area/frontend Related to the Airbyte webapp labels Aug 5, 2022
@roman-yermilov-gl roman-yermilov-gl temporarily deployed to more-secrets August 5, 2022 17:56 Inactive
@roman-yermilov-gl roman-yermilov-gl force-pushed the ryermilov/source-google-analytics-data-api branch from cd52596 to c300efe Compare August 5, 2022 19:25
@roman-yermilov-gl roman-yermilov-gl temporarily deployed to more-secrets August 5, 2022 19:27 Inactive
@roman-yermilov-gl roman-yermilov-gl force-pushed the ryermilov/source-google-analytics-data-api branch from c300efe to 26c5adb Compare August 5, 2022 21:52
@github-actions github-actions bot removed area/platform issues related to the platform area/frontend Related to the Airbyte webapp labels Aug 5, 2022
@roman-yermilov-gl roman-yermilov-gl temporarily deployed to more-secrets August 5, 2022 21:53 Inactive
@lazebnyi lazebnyi changed the title Source: google analytics data api refactoring 🎉Source Google Analytics Data Api: code refactoring Aug 9, 2022
@lazebnyi lazebnyi requested review from edmundito, bazarnov, grubberr and Phlair and removed request for marcosmarxm August 9, 2022 12:17
@edmundito edmundito removed their request for review August 9, 2022 12:23
Copy link
Contributor

@Phlair Phlair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests + Oauth implementation looking great!

Some open questions...

Comment on lines +1 to +19
docker_image := airbyte/$(notdir $(CURDIR)):dev

run-build:
docker build . -t ${docker_image}

spec:
@docker run --rm $(docker_image) spec | jq

check:
@docker run --rm -v $(PWD)/secrets:/secrets $(docker_image) check --config /secrets/config.json | jq

discover:
@docker run --rm -v $(PWD)/secrets:/secrets $(docker_image) discover --config /secrets/config.json | jq

read:
@docker run --rm -v $(PWD)/secrets:/secrets -v $(PWD)/integration_tests:/integration_tests $(docker_image) read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json | jq

unittest-local:
@python -m pytest unit_tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this Makefile for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to have it in repository. I created it for running routine operations more quickly. It is only for development purpose and for convenience. For example check is a long command but I am just typing make check in my console

@@ -5,7 +5,7 @@

from setuptools import find_packages, setup

MAIN_REQUIREMENTS = ["airbyte-cdk~=0.1", "google-analytics-data==0.11.2"]
MAIN_REQUIREMENTS = ["airbyte-cdk~=0.1", "google-analytics-data==0.11.2", "PyJWT", "cryptography", "requests"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib versions should be pinned please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

from source_google_analytics_data_api import utils


class GoogleServiceKeyAuthenticator(requests.auth.AuthBase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this completely unique such that we can't use any of the cdk authenticator frameworks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find any proper base class in cdk for service authentication. In some other connectors they are using OAuth class as a base class and re-implementing methods but it is basically the same as I do using requests.auth.AuthBase. It is more clea[n/r] I think

Comment on lines +1 to +14
import calendar
import datetime


def datetime_to_secs(dt: datetime.datetime) -> int:
return calendar.timegm(dt.utctimetuple())


def string_to_date(d: str, f: str = "%Y-%m-%d") -> datetime.date:
return datetime.datetime.strptime(d, f).date()


def date_to_string(d: datetime.date, f: str = "%Y-%m-%d") -> str:
return d.strftime(f)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we generally use pendulum library for datetime operations (it's basically datetime++)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will keep it in mind. Should I rewrite these utils for using pendulum or they are good as is for just doing formatting?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine if it works

@roman-yermilov-gl roman-yermilov-gl temporarily deployed to more-secrets August 10, 2022 09:57 Inactive
@roman-yermilov-gl
Copy link
Contributor Author

roman-yermilov-gl commented Aug 14, 2022

/test connector=connectors/source-google-analytics-data-api

🕑 connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2853857417
✅ connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2853857417
Python tests coverage:

Name                                                Stmts   Miss  Cover
-----------------------------------------------------------------------
source_google_analytics_data_api/__init__.py            2      0   100%
source_google_analytics_data_api/source.py            170     48    72%
source_google_analytics_data_api/utils.py               8      3    62%
source_google_analytics_data_api/authenticator.py      44     23    48%
-----------------------------------------------------------------------
TOTAL                                                 224     74    67%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       121     25    79%   21-23, 29-31, 36-43, 48-61, 208-214
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       110     48    56%   23-26, 32, 36, 39-64, 67-69, 72-74, 77-79, 82-84, 87-89, 92-110, 144-146
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1321    463    65%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:60: Skipping TestIncremental.test_two_sequential_reads because not found in the config
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.0.2.
================== 25 passed, 2 skipped in 278.25s (0:04:38) ===================

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Aug 15, 2022
@grubberr grubberr temporarily deployed to more-secrets August 15, 2022 13:21 Inactive
@grubberr
Copy link
Contributor

@roman-yermilov-gl
I think we don't need this
airbyte-integrations/connectors/source-google-analytics-data-api/integration_tests/catalog.json

@roman-yermilov-gl roman-yermilov-gl force-pushed the ryermilov/source-google-analytics-data-api branch from 0417db9 to c861452 Compare August 15, 2022 16:13
@roman-yermilov-gl roman-yermilov-gl temporarily deployed to more-secrets August 15, 2022 16:16 Inactive
@roman-yermilov-gl
Copy link
Contributor Author

roman-yermilov-gl commented Aug 15, 2022

/test connector=connectors/source-google-analytics-data-api

🕑 connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2861941416
✅ connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2861941416
Python tests coverage:

Name                                                Stmts   Miss  Cover
-----------------------------------------------------------------------
source_google_analytics_data_api/__init__.py            2      0   100%
source_google_analytics_data_api/source.py            170     48    72%
source_google_analytics_data_api/utils.py               8      3    62%
source_google_analytics_data_api/authenticator.py      44     23    48%
-----------------------------------------------------------------------
TOTAL                                                 224     74    67%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       121     25    79%   21-23, 29-31, 36-43, 48-61, 208-216
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       110     48    56%   23-26, 32, 36, 39-64, 67-69, 72-74, 77-79, 82-84, 87-89, 92-110, 144-146
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1321    463    65%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:60: Skipping TestIncremental.test_two_sequential_reads because not found in the config
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.0.2.
================== 25 passed, 2 skipped in 284.08s (0:04:44) ===================

@roman-yermilov-gl roman-yermilov-gl force-pushed the ryermilov/source-google-analytics-data-api branch from c861452 to 4e5ddd2 Compare August 16, 2022 08:32
@roman-yermilov-gl roman-yermilov-gl temporarily deployed to more-secrets August 16, 2022 08:35 Inactive
@grubberr grubberr temporarily deployed to more-secrets August 17, 2022 09:14 Inactive
@grubberr
Copy link
Contributor

grubberr commented Aug 17, 2022

/test connector=connectors/source-google-analytics-data-api

🕑 connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2874237755
✅ connectors/source-google-analytics-data-api https://github.com/airbytehq/airbyte/actions/runs/2874237755
Python tests coverage:

Name                                                Stmts   Miss  Cover
-----------------------------------------------------------------------
source_google_analytics_data_api/__init__.py            2      0   100%
source_google_analytics_data_api/source.py            170     48    72%
source_google_analytics_data_api/utils.py               8      3    62%
source_google_analytics_data_api/authenticator.py      44     23    48%
-----------------------------------------------------------------------
TOTAL                                                 224     74    67%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       121     25    79%   21-23, 29-31, 36-43, 48-61, 208-216
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       110     48    56%   23-26, 32, 36, 39-64, 67-69, 72-74, 77-79, 82-84, 87-89, 92-110, 144-146
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1321    463    65%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:60: Skipping TestIncremental.test_two_sequential_reads because not found in the config
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.0.2.
================== 25 passed, 2 skipped in 369.73s (0:06:09) ===================

@roman-yermilov-gl
Copy link
Contributor Author

roman-yermilov-gl commented Aug 17, 2022

/publish connector=connectors/source-google-analytics-data-api

🕑 Publishing the following connectors:
connectors/source-google-analytics-data-api
https://github.com/airbytehq/airbyte/actions/runs/2875050214


Connector Did it publish? Were definitions generated?
connectors/source-google-analytics-data-api

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@roman-yermilov-gl roman-yermilov-gl merged commit e40ee3f into master Aug 17, 2022
@roman-yermilov-gl roman-yermilov-gl deleted the ryermilov/source-google-analytics-data-api branch August 17, 2022 12:49
rodireich pushed a commit that referenced this pull request Aug 25, 2022
* Source: google analytics data api refactoring

* google-analytics-data-api.md

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* updated spec: added custom reports field

* doclint

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* auto-bump connector version [ci skip]

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/google-analytics-data-api
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Source Google Analytics v4: allow run GA account without view id
9 participants