Update dbt clickhouse #14833

Closed
wants to merge 15 commits into from

Conversation

guykoh
Contributor

@guykoh guykoh commented Jul 19, 2022

What

Update dbt-clickhouse version to support Airbyte on ClickHouse cloud

How

Version 1.1.7 supports CREATE TABLE AS SELECT in ReplicatedMergeTree database

🚨 User Impact 🚨

🚨🚨 Enable Airbyte integration with ClickHouse cloud

Pre-merge Checklist

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For Java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

@guykoh guykoh requested a review from a team as a code owner July 19, 2022 13:07
@CLAassistant

CLAassistant commented Jul 19, 2022

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added area/documentation Improvements or additions to documentation area/platform issues related to the platform area/worker Related to worker labels Jul 19, 2022
@harshithmullapudi
Contributor

Hey @guykoh, could you resolve the conflicts?

guykoh added 3 commits July 20, 2022 09:05
…e_dbt_clickhouse

Conflicts:
	airbyte-integrations/bases/base-normalization/Dockerfile
	docs/understanding-airbyte/basic-normalization.md
@github-actions github-actions bot removed area/worker Related to worker area/platform issues related to the platform labels Jul 20, 2022
@github-actions github-actions bot added area/platform issues related to the platform area/worker Related to worker labels Jul 20, 2022
@guykoh
Contributor Author

guykoh commented Jul 20, 2022

@harshithmullapudi, I resolved the conflicts

@grishick grishick temporarily deployed to more-secrets July 20, 2022 19:43 Inactive
@grishick
Contributor

I tried running normalization tests (this PR) and got this error:

1 check failed:
	 dbt was unable to connect to the specified database.
	 The database returned the following error:
	   >Database Error
	   Library for ClickHouse driver type None not found

@guykoh
Contributor Author

guykoh commented Jul 21, 2022

Hi @grishick, I left you a comment in your PR: you don't need the driver field, because the driver is derived from the port (9440 or 9000 for 'native', 8123 or 8443 for 'http'). In any case, 'clickhouse-connect' is an invalid value; use 'native' or 'http' instead (in your case, 'native').
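
The port rule in the comment above can be sketched as follows. This is a minimal illustration, not the dbt-clickhouse implementation: the function name `infer_driver` and its error messages are hypothetical, and only the port numbers and the two driver names come from the comment.

```python
# Hypothetical helper illustrating the port-to-driver rule described above.
# It is NOT taken from dbt-clickhouse; only the ports and driver names are
# from the comment.
NATIVE_PORTS = {9000, 9440}
HTTP_PORTS = {8123, 8443}
VALID_DRIVERS = ("native", "http")


def infer_driver(port, driver=None):
    """Return 'native' or 'http' for a ClickHouse port.

    An explicitly supplied driver is validated and returned unchanged;
    otherwise the driver is inferred from the port.
    """
    if driver is not None:
        if driver not in VALID_DRIVERS:
            raise ValueError(
                f"invalid driver {driver!r}; use 'native' or 'http'"
            )
        return driver
    if port in NATIVE_PORTS:
        return "native"
    if port in HTTP_PORTS:
        return "http"
    raise ValueError(f"cannot infer a driver from port {port}")
```

Under this sketch, leaving the driver unset with port 9000 or 9440 yields 'native', while passing 'clickhouse-connect' raises an error instead of silently falling through to a `None` driver type.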

@grishick
Contributor

grishick commented Jul 21, 2022

@guykoh thanks for the pointer. I added the suggested change to my PR and this problem got resolved. Now there is one test that is still failing with the following errors. Looks like special characters in the SQL query may be tripping up Clickhouse in this test:

	 22:00:21  25 of 35 START table model ***_ci_ebzlq.exchange_rate ................................................... [RUN]
	 22:00:21  25 of 35 ERROR creating table model ***_ci_ebzlq.exchange_rate .......................................... [ERROR in 0.04s]
	 22:00:21  26 of 35 START incremental model ***_ci_ebzlq.multiple_column_names_conflicts_scd ....................... [RUN]
	 22:00:21  26 of 35 OK created incremental model ***_ci_ebzlq.multiple_column_names_conflicts_scd .................. [OK in 0.07s]
	 22:00:21  27 of 35 START incremental model ***_ci_ebzlq.pos_dedup_cdcx_scd ........................................ [RUN]
	 22:00:21  27 of 35 OK created incremental model ***_ci_ebzlq.pos_dedup_cdcx_scd ................................... [OK in 0.07s]
	 22:00:21  28 of 35 START incremental model ***_ci_ebzlq.renamed_dedup_cdc_excluded_scd ............................ [RUN]
	 22:00:21  28 of 35 OK created incremental model ***_ci_ebzlq.renamed_dedup_cdc_excluded_scd ....................... [OK in 0.07s]
	 22:00:21  29 of 35 START incremental model ***_ci_ebzlq.1_prefix_startwith_number ................................. [RUN]
	 22:00:21  29 of 35 OK created incremental model ***_ci_ebzlq.1_prefix_startwith_number ............................ [OK in 0.03s]
	 22:00:21  30 of 35 START incremental model ***_ci_ebzlq.dedup_cdc_excluded ........................................ [RUN]
	 22:00:21  30 of 35 OK created incremental model ***_ci_ebzlq.dedup_cdc_excluded ................................... [OK in 0.04s]
	 22:00:21  31 of 35 START incremental model ***_ci_ebzlq.dedup_exchange_rate ....................................... [RUN]
	 22:00:21  31 of 35 OK created incremental model ***_ci_ebzlq.dedup_exchange_rate .................................. [OK in 0.04s]
	 22:00:21  32 of 35 START incremental model ***_ci_ebzlq.multiple_column_names_conflicts ........................... [RUN]
	 22:00:21  32 of 35 OK created incremental model ***_ci_ebzlq.multiple_column_names_conflicts ...................... [OK in 0.04s]
	 22:00:21  33 of 35 START incremental model ***_ci_ebzlq.pos_dedup_cdcx ............................................ [RUN]
	 22:00:21  33 of 35 OK created incremental model ***_ci_ebzlq.pos_dedup_cdcx ....................................... [OK in 0.03s]
	 22:00:21  34 of 35 START incremental model ***_ci_ebzlq.renamed_dedup_cdc_excluded ................................ [RUN]
	 22:00:21  34 of 35 OK created incremental model ***_ci_ebzlq.renamed_dedup_cdc_excluded ........................... [OK in 0.03s]
	 22:00:21  35 of 35 SKIP relation ***_ci_ebzlq.simple_streams_first_run_row_counts ................................. [SKIP]
	 22:00:21  
	 22:00:21  Finished running 21 view models, 12 incremental models, 2 table models in 1.55s.
	 22:00:21  
	 22:00:21  Completed with 1 error and 0 warnings:
	 22:00:21  
	 22:00:21  Database Error in model exchange_rate (models/generated/airbyte_tables/***_ci_ebzlq/exchange_rate.sql)
	 22:00:21    Code: 62.
	 22:00:21    DB::Exception: Syntax error: failed at position 232 ('(') (line 5, col 72): ("id", "currency", "date", "timestamp_col", "HKD@spéçiäl & characters", "HKD_special___characters", "NZD", "USD", "column`_'with"_quotes", "_airbyte_ab_id", . Unmatched parentheses: (. Stack trace:
	 22:00:21    
	 22:00:21    0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xba37dda in /usr/bin/clickhouse
	 22:00:21    1. DB::parseQueryAndMovePosition(DB::IParser&, char const*&, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, unsigned long) @ 0x181a1a7f in /usr/bin/clickhouse
	 22:00:21    2. ? @ 0x16ecb3bf in /usr/bin/clickhouse
	 22:00:21    3. DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum) @ 0x16ecb015 in /usr/bin/clickhouse
	 22:00:21    4. DB::TCPHandler::runImpl() @ 0x17b4041c in /usr/bin/clickhouse
	 22:00:21    5. DB::TCPHandler::run() @ 0x17b53499 in /usr/bin/clickhouse
	 22:00:21    6. Poco::Net::TCPServerConnection::start() @ 0x1a98c273 in /usr/bin/clickhouse
	 22:00:21    7. Poco::Net::TCPServerDispatcher::run() @ 0x1a98d66d in /usr/bin/clickhouse
	 22:00:21    8. Poco::PooledThread::run() @ 0x1ab492fd in /usr/bin/clickhouse
	 22:00:21    9. Poco::ThreadImpl::runnableEntry(void*) @ 0x1ab46942 in /usr/bin/clickhouse
	 22:00:21    10. ? @ 0x7f6d1e008609 in ?
	 22:00:21    11. clone @ 0x7f6d1df2d133 in ?
	 22:00:21    compiled SQL at ../build/run/airbyte_utils/models/generated/airbyte_tables/***_ci_ebzlq/exchange_rate.sql
	 22:00:21  
	 22:00:21  Done. PASS=33 WARN=0 ERROR=1 SKIP=1 TOTAL=35
	 docker run --rm --init -v /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/normalization_test_output/clickhouse/test_simple_streams:/workspace -v /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/normalization_test_output/clickhouse/test_simple_streams/build:/build -v /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/normalization_test_output/clickhouse/test_simple_streams/logs:/logs -v /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/normalization_test_output/clickhouse/test_simple_streams/build/dbt_packages:/dbt --network host --entrypoint /usr/local/bin/dbt -i airbyte/normalization-clickhouse:dev --event-buffer-size=10000 run --profiles-dir=/workspace --project-dir=/workspace --full-refresh
	 	terminated with return code 1 with 1 'Error/Warning/Fail' mention(s).
FAILED

@github-actions github-actions bot removed area/platform issues related to the platform area/worker Related to worker labels Jul 25, 2022
@grishick
Contributor

I updated my PR with changes from this PR and changes from @mzitnik. However, the special characters test is still failing:

	 21:38:33    DB::Exception: Syntax error: failed at position 232 ('(') (line 5, col 72): ("id", "currency", "date", "timestamp_col", "HKD@spéçiäl & characters", "HKD_special___characters", "NZD", "USD", "column`_'with"_quotes", "_airbyte_ab_id", . Unmatched parentheses: (. Stack trace:
	 21:38:33    
	 21:38:33    0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xba37dda in /usr/bin/clickhouse
	 21:38:33    1. DB::parseQueryAndMovePosition(DB::IParser&, char const*&, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, unsigned long) @ 0x181a1a7f in /usr/bin/clickhouse
	 21:38:33    2. ? @ 0x16ecb3bf in /usr/bin/clickhouse
	 21:38:33    3. DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum) @ 0x16ecb015 in /usr/bin/clickhouse
	 21:38:33    4. DB::TCPHandler::runImpl() @ 0x17b4041c in /usr/bin/clickhouse
	 21:38:33    5. DB::TCPHandler::run() @ 0x17b53499 in /usr/bin/clickhouse
	 21:38:33    6. Poco::Net::TCPServerConnection::start() @ 0x1a98c273 in /usr/bin/clickhouse
	 21:38:33    7. Poco::Net::TCPServerDispatcher::run() @ 0x1a98d66d in /usr/bin/clickhouse
	 21:38:33    8. Poco::PooledThread::run() @ 0x1ab492fd in /usr/bin/clickhouse
	 21:38:33    9. Poco::ThreadImpl::runnableEntry(void*) @ 0x1ab46942 in /usr/bin/clickhouse
	 21:38:33    10. ? @ 0x7f3297c06609 in ?
	 21:38:33    11. clone @ 0x7f3297b2b133 in ?
	 21:38:33    compiled SQL at ../build/run/airbyte_utils/models/generated/airbyte_tables/***_ci_sglcn/exchange_rate.sql

@harshithmullapudi
Contributor

Hey @guykoh any update?

@guykoh guykoh closed this Aug 8, 2022
@guykoh
Contributor Author

guykoh commented Aug 8, 2022

I closed this PR, since @grishick included all of these changes in his PR and added more fixes. I commented on @grishick's PR: the identifiers should be quoted as well, not just column names, including identifiers without special characters but with spaces.

At least this is what I see when I run the tests locally
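
The quoting point can be illustrated with a small sketch. It assumes that ClickHouse accepts backtick-quoted identifiers with backslash escapes for embedded backticks and backslashes; that assumption should be checked against the ClickHouse syntax documentation. The helper name `quote_identifier` is hypothetical and is not part of base-normalization or dbt-clickhouse. The column names are taken from the failing test output earlier in this thread.

```python
def quote_identifier(name):
    """Wrap a name in backticks so spaces and special characters are safe.

    Assumption: backslash-escaping backslashes and backticks is a valid
    escape inside a backtick-quoted ClickHouse identifier.
    """
    escaped = name.replace("\\", "\\\\").replace("`", "\\`")
    return f"`{escaped}`"


# Column names from the failing exchange_rate model in the log above.
columns = [
    "id",
    "HKD@spéçiäl & characters",
    "column`_'with\"_quotes",
]

# Every identifier is quoted, including plain ones such as "id".
column_list = ", ".join(quote_identifier(c) for c in columns)
print(column_list)
```

Quoting every identifier uniformly, rather than only those detected as "special", avoids having to decide which names need it. The embedded double quote no longer terminates the identifier early, which is the likely cause of the unmatched-parenthesis error in the log.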

@harshithmullapudi harshithmullapudi removed their assignment Aug 11, 2022
Labels
area/documentation Improvements or additions to documentation community normalization
5 participants