Tweak dbt configuration parameters to reasonable values #9846

Merged: ChristopheDuong merged 3 commits into master from chris/normalization-param-tweaks on Jan 28, 2022


@ChristopheDuong ChristopheDuong commented Jan 27, 2022

What

The multi-threading parameters for normalization were set to 32 by default.

In the past (before implementing internal staging for Snowflake), we've seen time-out errors from Snowflake trying to stop us from spamming their API (with insert writes).

So, following up on https://github.com/airbytehq/oncall/issues/120, I'm wondering if dbt is sometimes hitting some similar thresholds/limits randomly?

For instance, in the docs, I've seen recommendations or people mentioning using 5 to 10 threads:

From dbt docs:

threads: [between 1 and 8]
https://docs.getdbt.com/reference/warehouse-profiles/snowflake-profile

From Snowflake docs:

In Snowflake the parameter MAX_CONCURRENCY_LEVEL defines the maximum number of parallel or concurrent statements a warehouse can execute.
By default the value is set to 8. This means at any given point of time the warehouse will allow a maximum of 8 queries to run concurrently if the resources on that warehouse can fit all of them simultaneously.
In reality, we can have more than 8 concurrent queries as well. But it depends on the factors such as the complexity of the queries, their resource consumptions etc.
https://community.snowflake.com/s/article/Warehouse-Concurrency-and-Statement-Timeout-Parameters

How

This PR tunes the thread parameters down to match the recommendations or default values in the dbt docs for some destinations.
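
To illustrate the idea of tuning threads down per destination, here is a minimal Python sketch. It is not the code from this PR: the helper name `cap_threads`, the `THREADS_BY_DESTINATION` table, and the values in it are all hypothetical and chosen only for illustration. The only thing taken from the discussion above is that a dbt profile target carries a `threads` setting.

```python
# Hypothetical per-destination caps; the numbers are illustrative assumptions,
# not the values chosen in this PR.
DEFAULT_THREADS = 8
THREADS_BY_DESTINATION = {"snowflake": 5, "postgres": 8}


def cap_threads(profile: dict, destination: str) -> dict:
    """Return a copy of a dbt profile target with `threads` capped.

    `profile` is assumed to be the dict form of one profile target,
    for example {"type": "snowflake", "threads": 32}.
    """
    limit = THREADS_BY_DESTINATION.get(destination, DEFAULT_THREADS)
    capped = dict(profile)  # shallow copy so the caller's dict is untouched
    capped["threads"] = min(int(profile.get("threads", limit)), limit)
    return capped


old_profile = {"type": "snowflake", "threads": 32}
new_profile = cap_threads(old_profile, "snowflake")
```

With these assumed caps, a profile that asked for 32 threads ends up with the destination's cap, and a profile that already asks for fewer threads is left alone.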

It also tunes dbt to take advantage of some retry mechanisms with backoff timeouts.
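
For readers unfamiliar with the pattern, the sketch below shows what "retry with backoff" generally means. It is a generic Python illustration under my own assumptions, not dbt's implementation and not the configuration changed in this PR; `retry_with_backoff` and all of its parameters are hypothetical names.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on any exception with exponential backoff.

    The wait doubles after each failure (base_delay, 2x, 4x, ...), is capped
    at `max_delay`, and gets up to 10% random jitter so that parallel
    threads do not all retry at the same instant.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
            attempt += 1
```

The `sleep` argument is injectable only so the helper can be exercised without actually waiting. A real caller would typically narrow the `except` clause to the transient error types it expects.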

Hopefully, these small changes will improve overall performance and stability
(atm we don't have benchmarks to verify the impacts of these changes, right @tuliren?)

ChristopheDuong commented Jan 27, 2022

/test connector=bases/base-normalization

🕑 bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/1757685528
❌ bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/1757685528
🐛

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets January 27, 2022 17:57 Inactive

@tuliren tuliren left a comment

atm we don't have benchmarks to verify the impacts of these changes, right

No, we don't have any right now. It's fairly easy to create one by building a mock source that 1) has the same catalog as the problematic real source, and 2) emits the same number of records that is giving Snowflake normalization trouble. Feel free to do that in the [Benchmark] Destination Warehouse workspace.
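
As a rough starting point for such a mock source, the record-emitting part could look like the Python sketch below. This is an assumption-laden illustration, not an existing Airbyte connector: the function name `mock_records`, the stream name, and the record fields are hypothetical, and a real mock would need to reproduce the actual catalog of the problematic source. The message shape follows the Airbyte protocol's RECORD message (`stream`, `data`, `emitted_at` in milliseconds).

```python
import json
import time
from typing import Iterator


def mock_records(stream: str, count: int) -> Iterator[str]:
    """Yield `count` Airbyte RECORD messages for `stream` as JSON lines.

    The `data` fields here are placeholders; a real benchmark would mirror
    the schema of the source that triggers the normalization problem.
    """
    for i in range(count):
        message = {
            "type": "RECORD",
            "record": {
                "stream": stream,
                "data": {"id": i, "payload": f"row-{i}"},
                "emitted_at": int(time.time() * 1000),  # epoch milliseconds
            },
        }
        yield json.dumps(message)


if __name__ == "__main__":
    # Emit a handful of records to stdout, one JSON message per line.
    for line in mock_records("users", 3):
        print(line)
```

Raising `count` to the record volume of the failing sync would give a repeatable load for comparing thread settings.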

ChristopheDuong commented Jan 28, 2022

/test connector=bases/base-normalization

🕑 bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/1760504908
✅ bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/1760504908
Python tests coverage:

	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                       Stmts   Miss  Cover
	 --------------------------------------------------------------
	 base_python/__init__.py                       13      0   100%
	 base_python/catalog_helpers.py                10      6    40%
	 base_python/cdk/__init__.py                    0      0   100%
	 base_python/cdk/abstract_source.py            89     64    28%
	 base_python/cdk/streams/__init__.py            0      0   100%
	 base_python/cdk/streams/auth/__init__.py       0      0   100%
	 base_python/cdk/streams/auth/core.py           8      1    88%
	 base_python/cdk/streams/auth/jwt.py            5      5     0%
	 base_python/cdk/streams/auth/oauth.py         37     26    30%
	 base_python/cdk/streams/auth/token.py          9      4    56%
	 base_python/cdk/streams/core.py               63     32    49%
	 base_python/cdk/streams/exceptions.py         10      2    80%
	 base_python/cdk/streams/http.py               67     33    51%
	 base_python/cdk/streams/rate_limiting.py      30     14    53%
	 base_python/cdk/utils/__init__.py              0      0   100%
	 base_python/cdk/utils/casing.py                4      0   100%
	 base_python/cdk/utils/event_timing.py         47      3    94%
	 base_python/client.py                         56     33    41%
	 base_python/entrypoint.py                     70     56    20%
	 base_python/integration.py                    52     25    52%
	 base_python/logger.py                         33     15    55%
	 base_python/schema_helpers.py                 56     41    27%
	 base_python/source.py                         51     34    33%
	 main_dev.py                                    3      3     0%
	 --------------------------------------------------------------
	 TOTAL                                        713    397    44%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                              Stmts   Miss  Cover
	 -------------------------------------------------------------------------------------
	 main_dev_transform_catalog.py                                         3      3     0%
	 main_dev_transform_config.py                                          3      3     0%
	 normalization/__init__.py                                             4      0   100%
	 normalization/destination_type.py                                    13      0   100%
	 normalization/transform_catalog/__init__.py                           2      0   100%
	 normalization/transform_catalog/catalog_processor.py                143     77    46%
	 normalization/transform_catalog/destination_name_transformer.py     155      8    95%
	 normalization/transform_catalog/reserved_keywords.py                 13      0   100%
	 normalization/transform_catalog/stream_processor.py                 520    333    36%
	 normalization/transform_catalog/table_name_registry.py              174     34    80%
	 normalization/transform_catalog/transform.py                         45     26    42%
	 normalization/transform_catalog/utils.py                             33      7    79%
	 normalization/transform_config/__init__.py                            2      0   100%
	 normalization/transform_config/transform.py                         148     34    77%
	 -------------------------------------------------------------------------------------
	 TOTAL                                                              1258    525    58%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                 Stmts   Miss  Cover
	 ------------------------------------------------------------------------
	 source_acceptance_test/__init__.py                       2      0   100%
	 source_acceptance_test/base.py                          10      4    60%
	 source_acceptance_test/config.py                        74      6    92%
	 source_acceptance_test/conftest.py                     109    109     0%
	 source_acceptance_test/plugin.py                        47     47     0%
	 source_acceptance_test/tests/__init__.py                 4      0   100%
	 source_acceptance_test/tests/test_core.py              242     96    60%
	 source_acceptance_test/tests/test_full_refresh.py       38      0   100%
	 source_acceptance_test/tests/test_incremental.py        69     38    45%
	 source_acceptance_test/utils/__init__.py                 6      0   100%
	 source_acceptance_test/utils/asserts.py                 37      2    95%
	 source_acceptance_test/utils/common.py                  54     17    69%
	 source_acceptance_test/utils/compare.py                 62     23    63%
	 source_acceptance_test/utils/connector_runner.py       110     48    56%
	 source_acceptance_test/utils/json_schema_helper.py     115     14    88%
	 ------------------------------------------------------------------------
	 TOTAL                                                  979    404    59%
	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                              Stmts   Miss  Cover
	 -------------------------------------------------------------------------------------
	 main_dev_transform_catalog.py                                         3      3     0%
	 main_dev_transform_config.py                                          3      3     0%
	 normalization/__init__.py                                             4      0   100%
	 normalization/destination_type.py                                    13      0   100%
	 normalization/transform_catalog/__init__.py                           2      0   100%
	 normalization/transform_catalog/catalog_processor.py                143     12    92%
	 normalization/transform_catalog/destination_name_transformer.py     155      4    97%
	 normalization/transform_catalog/reserved_keywords.py                 13      0   100%
	 normalization/transform_catalog/stream_processor.py                 520     39    92%
	 normalization/transform_catalog/table_name_registry.py              174     51    71%
	 normalization/transform_catalog/transform.py                         45     30    33%
	 normalization/transform_catalog/utils.py                             33      0   100%
	 normalization/transform_config/__init__.py                            2      0   100%
	 normalization/transform_config/transform.py                         148     46    69%
	 -------------------------------------------------------------------------------------
	 TOTAL                                                              1258    188    85%

@ChristopheDuong ChristopheDuong temporarily deployed to more-secrets January 28, 2022 08:14 Inactive
@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets January 28, 2022 08:15 Inactive
@github-actions github-actions bot added area/documentation Improvements or additions to documentation area/platform issues related to the platform area/worker Related to worker labels Jan 28, 2022
@ChristopheDuong ChristopheDuong temporarily deployed to more-secrets January 28, 2022 09:40 Inactive

ChristopheDuong commented Jan 28, 2022

/publish connector=bases/base-normalization

🕑 bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/1760886056
✅ bases/base-normalization https://github.com/airbytehq/airbyte/actions/runs/1760886056

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets January 28, 2022 09:53 Inactive
@ChristopheDuong ChristopheDuong temporarily deployed to more-secrets January 28, 2022 11:15 Inactive
@ChristopheDuong ChristopheDuong merged commit 87a3055 into master Jan 28, 2022
@ChristopheDuong ChristopheDuong deleted the chris/normalization-param-tweaks branch January 28, 2022 11:33
Labels: area/documentation (Improvements or additions to documentation), area/platform (issues related to the platform), area/worker (Related to worker), normalization