🐛 Source FB Marketing: deprecate INSIGHTS_DAYS_PER_JOB from connector's specification. #8234

Closed

Conversation

bazarnov
Collaborator

@bazarnov bazarnov commented Nov 24, 2021

What

#8027

How

  • edited source.py
  • corrected Readme.md

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow the instructions in the README. For Java connectors, run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here



@bazarnov bazarnov self-assigned this Nov 24, 2021
@bazarnov bazarnov linked an issue Nov 24, 2021 that may be closed by this pull request
@github-actions github-actions bot added the area/connectors Connector related issues label Nov 24, 2021
@jrhizor jrhizor temporarily deployed to more-secrets November 24, 2021 15:13 Inactive
@bazarnov
Collaborator Author

bazarnov commented Nov 24, 2021

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1500157622
❌ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1500157622
🐛 https://gradle.com/s/hn5pthx33go4a
🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1500157622
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1500157622
Python tests coverage:

	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                 Stmts   Miss  Cover
	 ------------------------------------------------------------------------
	 source_acceptance_test/__init__.py                       2      0   100%
	 source_acceptance_test/base.py                          10      4    60%
	 source_acceptance_test/config.py                        75      8    89%
	 source_acceptance_test/conftest.py                     108    108     0%
	 source_acceptance_test/plugin.py                        47     47     0%
	 source_acceptance_test/tests/__init__.py                 4      0   100%
	 source_acceptance_test/tests/test_core.py              200     94    53%
	 source_acceptance_test/tests/test_full_refresh.py       38     27    29%
	 source_acceptance_test/tests/test_incremental.py        69     38    45%
	 source_acceptance_test/utils/__init__.py                 6      0   100%
	 source_acceptance_test/utils/asserts.py                 37      2    95%
	 source_acceptance_test/utils/common.py                  41     24    41%
	 source_acceptance_test/utils/compare.py                 62     25    60%
	 source_acceptance_test/utils/connector_runner.py        82     49    40%
	 source_acceptance_test/utils/json_schema_helper.py     115     14    88%
	 ------------------------------------------------------------------------
	 TOTAL                                                  896    440    51%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                     Stmts   Miss  Cover
	 ------------------------------------------------------------
	 source_facebook_marketing/__init__.py        2      0   100%
	 source_facebook_marketing/api.py            75     17    77%
	 source_facebook_marketing/async_job.py      92     58    37%
	 source_facebook_marketing/common.py         37     11    70%
	 source_facebook_marketing/source.py        112     65    42%
	 source_facebook_marketing/streams.py       239     80    67%
	 ------------------------------------------------------------
	 TOTAL                                      557    231    59%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                     Stmts   Miss  Cover
	 ------------------------------------------------------------
	 source_facebook_marketing/__init__.py        2      0   100%
	 source_facebook_marketing/api.py            75     18    76%
	 source_facebook_marketing/async_job.py      92      1    99%
	 source_facebook_marketing/common.py         37      1    97%
	 source_facebook_marketing/source.py        112     72    36%
	 source_facebook_marketing/streams.py       239     80    67%
	 ------------------------------------------------------------
	 TOTAL                                      557    172    69%

@bazarnov bazarnov temporarily deployed to more-secrets November 24, 2021 16:01 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 24, 2021 16:03 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 24, 2021 18:39 Inactive
@bazarnov bazarnov linked an issue Nov 24, 2021 that may be closed by this pull request
@bazarnov bazarnov temporarily deployed to more-secrets November 25, 2021 10:19 Inactive
@bazarnov bazarnov temporarily deployed to more-secrets November 25, 2021 11:10 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 11:26 Inactive
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Nov 25, 2021
@bazarnov bazarnov temporarily deployed to more-secrets November 25, 2021 11:32 Inactive
@bazarnov
Collaborator Author

bazarnov commented Nov 25, 2021

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1503640804
❌ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1503640804
🐛 https://gradle.com/s/4jjbfhxzfzjka
🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1503640804
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1503640804
Python tests coverage:

	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                 Stmts   Miss  Cover
	 ------------------------------------------------------------------------
	 source_acceptance_test/__init__.py                       2      0   100%
	 source_acceptance_test/base.py                          10      4    60%
	 source_acceptance_test/config.py                        75      8    89%
	 source_acceptance_test/conftest.py                     108    108     0%
	 source_acceptance_test/plugin.py                        47     47     0%
	 source_acceptance_test/tests/__init__.py                 4      0   100%
	 source_acceptance_test/tests/test_core.py              200     94    53%
	 source_acceptance_test/tests/test_full_refresh.py       38     27    29%
	 source_acceptance_test/tests/test_incremental.py        69     38    45%
	 source_acceptance_test/utils/__init__.py                 6      0   100%
	 source_acceptance_test/utils/asserts.py                 37      2    95%
	 source_acceptance_test/utils/common.py                  41     24    41%
	 source_acceptance_test/utils/compare.py                 62     25    60%
	 source_acceptance_test/utils/connector_runner.py        82     49    40%
	 source_acceptance_test/utils/json_schema_helper.py     115     14    88%
	 ------------------------------------------------------------------------
	 TOTAL                                                  896    440    51%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                     Stmts   Miss  Cover
	 ------------------------------------------------------------
	 source_facebook_marketing/__init__.py        2      0   100%
	 source_facebook_marketing/api.py            75     17    77%
	 source_facebook_marketing/async_job.py      92     58    37%
	 source_facebook_marketing/common.py         37     11    70%
	 source_facebook_marketing/source.py        112     65    42%
	 source_facebook_marketing/streams.py       239     80    67%
	 ------------------------------------------------------------
	 TOTAL                                      557    231    59%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                     Stmts   Miss  Cover
	 ------------------------------------------------------------
	 source_facebook_marketing/__init__.py        2      0   100%
	 source_facebook_marketing/api.py            75     18    76%
	 source_facebook_marketing/async_job.py      92      1    99%
	 source_facebook_marketing/common.py         37      1    97%
	 source_facebook_marketing/source.py        112     72    36%
	 source_facebook_marketing/streams.py       239     80    67%
	 ------------------------------------------------------------
	 TOTAL                                      557    172    69%

@bazarnov bazarnov temporarily deployed to more-secrets November 25, 2021 12:16 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 12:17 Inactive
@bazarnov bazarnov temporarily deployed to more-secrets November 26, 2021 09:11 Inactive
@bazarnov
Collaborator Author

bazarnov commented Nov 26, 2021

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1506881247
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1506881247
Python tests coverage:

	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                 Stmts   Miss  Cover
	 ------------------------------------------------------------------------
	 source_acceptance_test/__init__.py                       2      0   100%
	 source_acceptance_test/base.py                          10      4    60%
	 source_acceptance_test/config.py                        75      8    89%
	 source_acceptance_test/conftest.py                     108    108     0%
	 source_acceptance_test/plugin.py                        47     47     0%
	 source_acceptance_test/tests/__init__.py                 4      0   100%
	 source_acceptance_test/tests/test_core.py              200     94    53%
	 source_acceptance_test/tests/test_full_refresh.py       38     27    29%
	 source_acceptance_test/tests/test_incremental.py        69     38    45%
	 source_acceptance_test/utils/__init__.py                 6      0   100%
	 source_acceptance_test/utils/asserts.py                 37      2    95%
	 source_acceptance_test/utils/common.py                  41     24    41%
	 source_acceptance_test/utils/compare.py                 62     25    60%
	 source_acceptance_test/utils/connector_runner.py        82     49    40%
	 source_acceptance_test/utils/json_schema_helper.py     115     14    88%
	 ------------------------------------------------------------------------
	 TOTAL                                                  896    440    51%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                     Stmts   Miss  Cover
	 ------------------------------------------------------------
	 source_facebook_marketing/__init__.py        2      0   100%
	 source_facebook_marketing/api.py            75     17    77%
	 source_facebook_marketing/async_job.py      92     58    37%
	 source_facebook_marketing/common.py         37     11    70%
	 source_facebook_marketing/source.py        112     65    42%
	 source_facebook_marketing/streams.py       239     80    67%
	 ------------------------------------------------------------
	 TOTAL                                      557    231    59%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                     Stmts   Miss  Cover
	 ------------------------------------------------------------
	 source_facebook_marketing/__init__.py        2      0   100%
	 source_facebook_marketing/api.py            75     18    76%
	 source_facebook_marketing/async_job.py      92      1    99%
	 source_facebook_marketing/common.py         37      1    97%
	 source_facebook_marketing/source.py        112     72    36%
	 source_facebook_marketing/streams.py       239     80    67%
	 ------------------------------------------------------------
	 TOTAL                                      557    172    69%

@bazarnov bazarnov temporarily deployed to more-secrets November 26, 2021 09:13 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 26, 2021 09:14 Inactive
@bazarnov bazarnov requested a review from keu November 26, 2021 12:40
@keu
Contributor

keu commented Nov 26, 2021

@keu fair enough. What's the path forward for deprecating this option, then?

@sherifnada @bazarnov I can't give a perfect solution right now; if I had one, I wouldn't have added such a configuration option. The problem with any hard-coded default value is that neither the user nor we can tune it.

If we set this value too high, we might get too many failures on accounts with a large amount of data, and vice versa.
Some connector (I don't remember the name) that Dmytro implemented has dynamic slices that adapt the amount of data fetched to the percentage of failures. Maybe we can decrease the size of the interval each time an AsyncJob fails. The challenge here is creating new jobs for the missing subslices.

Let's say we have jobs with interval = 5
slices/jobs:
ddddd ddddd ddddd

The second slice failed, so we retry with interval 2, but the 3rd slice is still running, and we need to add the missing jobs for the rest of the 2nd slice.
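
To make the subslice problem concrete, here is a minimal sketch of how a failed 5-day slice could be re-split with interval 2 so that no days are left uncovered. It is only an illustration of the idea above, not the connector's code: `DateSlice`, `split_slice`, and the dates are hypothetical names and values chosen for this example.

```python
from datetime import date, timedelta
from typing import List, Tuple

# Hypothetical type: an inclusive (start, end) date range handled by one job.
DateSlice = Tuple[date, date]


def split_slice(failed: DateSlice, interval_days: int) -> List[DateSlice]:
    """Split a failed inclusive date range into consecutive sub-slices
    of at most `interval_days` days each, covering every day exactly once."""
    if interval_days < 1:
        raise ValueError("interval_days must be at least 1")
    start, end = failed
    sub_slices: List[DateSlice] = []
    cursor = start
    while cursor <= end:
        # The last sub-slice may be shorter than interval_days.
        sub_end = min(cursor + timedelta(days=interval_days - 1), end)
        sub_slices.append((cursor, sub_end))
        cursor = sub_end + timedelta(days=1)
    return sub_slices


# Three 5-day slices were scheduled; suppose the second one (Nov 6-10) failed.
second = (date(2021, 11, 6), date(2021, 11, 10))
retry_jobs = split_slice(second, interval_days=2)
for sub_start, sub_end in retry_jobs:
    print(sub_start, sub_end)
```

Only the failed range is re-split, so the 3rd slice can keep running untouched. The retry for the 2nd slice becomes three jobs (2 + 2 + 1 days) instead of one.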

@bazarnov bazarnov temporarily deployed to more-secrets November 30, 2021 13:21 Inactive
@bazarnov
Collaborator Author

@keu What are the root causes of the possible job failures with date_slices? I've tested values of 300 and 1000 for the insights days per job, and all were processed successfully on our test account.
Therefore we can try to stay with a default value of 3 or 5. WDYT?

@keu
Contributor

keu commented Dec 1, 2021

@keu What are the root causes of the possible job failures with date_slices? I've tested values of 300 and 1000 for the insights days per job, and all were processed successfully on our test account. Therefore we can try to stay with a default value of 3 or 5. WDYT?

Since we have very little data on our test account, testing there is not even close to what happens on users' production accounts.
I don't know the exact root cause. My guess, from my experience and the Singer implementation, is that the more data we need to fetch, the higher the chance that a job will fail (probably a memory limitation or something similar). So decreasing insights days per job to 1 day should theoretically help (and it does), but not always (my guess is that this can be solved by querying fewer aggregation fields).

@avida is currently investigating this issue with the account provided by our customer (Cart).

@sherifnada
Contributor

@bazarnov let's postpone deprecating the insights days per job until #8385 is merged

@keu
Contributor

keu commented Jan 19, 2022

The deprecation was done within the scope of #8385 because that PR also touches the logic.
The only missing part is the re-ordering of columns. I think it makes sense to move that there as well and close this PR.

@bazarnov
Collaborator Author

Closed with reference to this: #8234 (comment)

@bazarnov bazarnov closed this Jan 19, 2022
keu pushed a commit that referenced this pull request Jan 22, 2022
keu added a commit that referenced this pull request Feb 17, 2022
* Facebook Marketing performance improvement

* add comments and little refactoring

* fix integration tests with the new config

* improve job status handling, limit concurrency to 10

* fix campaign jobs, refactor manager

* big refactoring of async jobs, support random order of slices

* update source _read_incremental to hook new state logic

* fix issues with timeout

* remove debugging and clean up, improve retry logic

* merge changes from #8234

* fix call super _read_increment

* generalize batch execution, add use_batch flag

* improve coverage, do some refactoring of spec

* update test, remove overrides of source

* add split by AdSet

* add smaller insights

* fix end_date < start_date case

* add account_id to PK

* add notes

* fix new streams

* fix reversed incremental stream

* update spec.json for SAT

* upgrade CDK and bump version

Co-authored-by: Dmytro Rezchykov <dmitry.rezchykov@zazmic.com>
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
@bazarnov bazarnov deleted the bazarnov/8027-fb-marketing-insigths-days-per-job branch September 20, 2022 09:59