
Commit 32a2409

Use built-in adapter functionality for datatypes
1 parent 2dfdb3c commit 32a2409

File tree

1 file changed (+81, −39)


macros/cross_db_utils/datatypes.sql

```diff
@@ -1,25 +1,37 @@
+/*
+One macro to rule them all
+
+Provides a nice interface into api.Column.translate_type
+https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/adapters/base/column.py#L10-L24
+
+Each adapter plugin can inherit/override TYPE_LABELS in its Column subclass
+E.g. BigQuery: https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/adapters/bigquery/column.py#L12-L19
+
+Maybe this is the one we push into dbt-core, and leave the others in dbt-utils?
+
+Downside: harder to tell what the valid options are, since the type is passed as an argument
+instead of being part of the macro name. We could add validation, but the default behavior
+feels better: just return the string passed in if there's no known translation.
+*/
+
+{%- macro get_data_type(dtype) -%}
+    {# if there is no translation for 'dtype', it just returns 'dtype' #}
+    {{ return(api.Column.translate_type(dtype)) }}
+{%- endmacro -%}
+
 {# string ------------------------------------------------- #}
 
 {%- macro type_string() -%}
     {{ return(adapter.dispatch('type_string', 'dbt_utils')()) }}
 {%- endmacro -%}
 
 {% macro default__type_string() %}
-    string
-{% endmacro %}
-
-{%- macro redshift__type_string() -%}
-    varchar
-{%- endmacro -%}
-
-{% macro postgres__type_string() %}
-    varchar
-{% endmacro %}
-
-{% macro snowflake__type_string() %}
-    varchar
+    {{ return(dbt_utils.get_data_type("string")) }}
 {% endmacro %}
 
+-- This will return 'text' by default
+-- On Postgres + Snowflake, that's equivalent to varchar (no size)
+-- Redshift will treat that as varchar(256)
```
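For intuition, `translate_type` boils down to a class-level dict lookup with a pass-through fallback, and each adapter plugin swaps in its own labels by subclassing. A minimal Python sketch (class names and label values are illustrative, not copied from dbt-core; the real class lives in `core/dbt/adapters/base/column.py`):

```python
# Illustrative sketch of api.Column.translate_type / TYPE_LABELS behavior.
# Label values are examples, not the dbt-core source.

class Column:
    # Base adapter: 'string' translates to 'text'
    TYPE_LABELS = {"STRING": "TEXT"}

    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # No known translation? Just return the string passed in.
        return cls.TYPE_LABELS.get(dtype.upper(), dtype)

class BigQueryColumn(Column):
    # An adapter plugin inherits/overrides TYPE_LABELS in its Column subclass
    TYPE_LABELS = {"STRING": "STRING", "FLOAT": "FLOAT64", "INTEGER": "INT64"}

print(Column.translate_type("string"))           # TEXT
print(Column.translate_type("bigint"))           # bigint (pass-through)
print(BigQueryColumn.translate_type("integer"))  # INT64
```

The pass-through default is what makes validation optional: an unknown dtype comes back unchanged rather than raising.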
```diff
@@ -29,16 +41,31 @@
 {%- endmacro -%}
 
 {% macro default__type_timestamp() %}
-    timestamp
+    {{ return(dbt_utils.get_data_type("timestamp")) }}
 {% endmacro %}
 
-{% macro postgres__type_timestamp() %}
-    timestamp without time zone
-{% endmacro %}
+/*
+POSTGRES
+https://www.postgresql.org/docs/current/datatype-datetime.html:
+The SQL standard requires that writing just `timestamp`
+be equivalent to `timestamp without time zone`, and
+PostgreSQL honors that behavior.
+`timestamptz` is accepted as an abbreviation for `timestamp with time zone`;
+this is a PostgreSQL extension.
 
-{% macro snowflake__type_timestamp() %}
-    timestamp_ntz
-{% endmacro %}
+SNOWFLAKE
+https://docs.snowflake.com/en/sql-reference/data-types-datetime.html#timestamp
+The TIMESTAMP_* variation associated with TIMESTAMP is specified by the
+TIMESTAMP_TYPE_MAPPING session parameter. The default is TIMESTAMP_NTZ.
+
+BIGQUERY
+'timestamp' means 'timestamp with time zone'
+'datetime' means 'timestamp without time zone'
+*/
 
 
 {# float ------------------------------------------------- #}
```
```diff
@@ -48,11 +75,7 @@
 {%- endmacro -%}
 
 {% macro default__type_float() %}
-    float
-{% endmacro %}
-
-{% macro bigquery__type_float() %}
-    float64
+    {{ return(dbt_utils.get_data_type("float")) }}
 {% endmacro %}
 
 {# numeric ------------------------------------------------ #}
```
```diff
@@ -61,12 +84,35 @@
     {{ return(adapter.dispatch('type_numeric', 'dbt_utils')()) }}
 {%- endmacro -%}
 
+/*
+This one can't be just translate_type, since precision/scale make it a bit more complicated.
+
+On most databases, the default (precision, scale) is something like:
+    Redshift: (18, 0)
+    Snowflake: (38, 0)
+    Postgres: (<=131072, 0)
+
+https://www.postgresql.org/docs/current/datatype-numeric.html:
+Specifying NUMERIC without any precision or scale creates an “unconstrained numeric”
+column in which numeric values of any length can be stored, up to the implementation limits.
+A column of this kind will not coerce input values to any particular scale,
+whereas numeric columns with a declared scale will coerce input values to that scale.
+(The SQL standard requires a default scale of 0, i.e., coercion to integer precision.
+We find this a bit useless. If you're concerned about portability, always specify
+the precision and scale explicitly.)
+*/
+
 {% macro default__type_numeric() %}
-    numeric(28, 6)
+    {{ return(api.Column.numeric_type("numeric", 28, 6)) }}
 {% endmacro %}
 
+-- BigQuery default scale for 'numeric' is numeric(38, 9)
+-- and it really doesn't like parametrized types
+-- Should we override 'numeric_type' for dbt-bigquery to avoid returning parametrized types?
+-- https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/adapters/bigquery/column.py
+
 {% macro bigquery__type_numeric() %}
-    numeric
+    {{ return(api.Column.numeric_type("numeric", None, None)) }}
 {% endmacro %}
```
```diff
@@ -76,24 +122,20 @@
     {{ return(adapter.dispatch('type_bigint', 'dbt_utils')()) }}
 {%- endmacro -%}
 
+-- We don't have a conversion type for 'bigint' in TYPE_LABELS,
+-- so this actually just returns the string 'bigint'
+
 {% macro default__type_bigint() %}
-    bigint
+    {{ return(dbt_utils.get_data_type("bigint")) }}
 {% endmacro %}
 
-{% macro bigquery__type_bigint() %}
-    int64
-{% endmacro %}
+-- Good news: BigQuery now supports 'bigint' (and 'int') as an alias for 'int64'
 
 {# int ------------------------------------------------- #}
 
 {%- macro type_int() -%}
-    {{ return(adapter.dispatch('type_int', 'dbt_utils')()) }}
+    {{ return(dbt_utils.get_data_type("integer")) }}
 {%- endmacro -%}
 
-{% macro default__type_int() %}
-    int
-{% endmacro %}
-
-{% macro bigquery__type_int() %}
-    int64
-{% endmacro %}
+-- returns 'int' everywhere, except BigQuery, where it returns 'int64'
+-- (but BigQuery also now accepts 'int' as a valid alias for 'int64')
```

0 commit comments
