
Commit 1a517d2

graciegoheen, joellabes, JamieRosenberg-canva, and patkearns10 authored
Feature/add listagg macro (#530)
* Update README.md
* Mutually excl range examples in disclosure triangle
* Fix union_relations error when no include/exclude provided
* Fix union_relations error when no include/exclude provided (#509)
* Update CHANGELOG.md
* Add to_condition to relationships where
* very minor nit - update "an new" to "a new" (#519)
* add quoting to split_part (#528)
* add quoting to split_part
* update docs for split_part
* typo
* corrected readme syntax
* revert and update to just documentation
* add new line
* Update README.md
* Update README.md
* Update README.md
Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>
* add macro to get columns (#516)
* add macro to get columns
* star macro should use get_columns
* add adapter.
* swap adapter for dbt_utils
Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>
* update documentation
* add output_lower arg
* update name to get_filtered_columns_in_relation from get_columns
* add tests
* forgot args
* too much whitespace removal
  -----------
  Actual:
  -----------
  --->"field_3"as "test_field_3"<---
  -----------
  Expected:
  -----------
  --->"field_3" as "test_field_3"<---
* didnt mean to move a file that i did not create. moving things back.
* remove lowercase logic
* limit_zero
Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>
* Add listagg macro and integration test
* remove type in listagg macro
* updated integration test
* Add redshift to listagg macro
* remove redshift listagg
* explicitly named group by column
* updated default values
* Updated example to use correct double vs. single quotes
* whitespace control
* Added redshift specific macro
* Remove documentation
* Update integration test so less likely to accidentally work
Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>
* default everything but measure to none
* added limit functionality for other dbs
* syntax bug for postgres
* update redshift macro
* fixed block def control
* Fixed bug in redshift
* Bug fix redshift
* remove unused group_by arg
* Added additional test without order by col
* updated to regex replace
* typo
* added more integration_tests
* attempt to make redshift less complicated
* typo
* update redshift
* replace to substr
* More explicit versions with added complexity
* handle special characters
Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>
Co-authored-by: Jamie Rosenberg <james.rosenberg@canva.com>
Co-authored-by: Pat Kearns <pat.kearns@fishtownanalytics.com>
1 parent 31577cb commit 1a517d2

15 files changed

+358
-20
lines changed

CHANGELOG.md

+1
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
1+
12
# dbt-utils v0.8.3
23
## New features
34
- A macro for deduplicating data ([#335](https://github.com/dbt-labs/dbt-utils/issues/335), [#512](https://github.com/dbt-labs/dbt-utils/pull/512))

README.md

+54-7
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this
3030

3131
- [Introspective macros](#introspective-macros):
3232
- [get_column_values](#get_column_values-source)
33+
- [get_filtered_columns_in_relation](#get_filtered_columns_in_relation-source)
3334
- [get_relations_by_pattern](#get_relations_by_pattern-source)
3435
- [get_relations_by_prefix](#get_relations_by_prefix-source)
3536
- [get_query_results_as_dict](#get_query_results_as_dict-source)
@@ -59,6 +60,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this
5960
- [split_part](#split_part-source)
6061
- [last_day](#last_day-source)
6162
- [width_bucket](#width_bucket-source)
63+
- [listagg](#listagg)
6264

6365
- [Jinja Helpers](#jinja-helpers)
6466
- [pretty_time](#pretty_time-source)
@@ -69,11 +71,11 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this
6971
- [insert_by_period](#insert_by_period-source)
7072

7173
----
72-
=======
7374
### Generic Tests
7475
#### equal_rowcount ([source](macros/generic_tests/equal_rowcount.sql))
7576
Asserts that two relations have the same number of rows.
7677

78+
7779
**Usage:**
7880
```yaml
7981
version: 2
@@ -387,7 +389,6 @@ models:
387389
```
388390
<details>
389391
<summary>Additional `gaps` and `zero_length_range_allowed` examples</summary>
390-
391392
**Understanding the `gaps` argument:**
392393

393394
Here are a number of examples for each allowed `gaps` argument.
@@ -435,7 +436,6 @@ models:
435436
| 0 | 1 |
436437
| 2 | 2 |
437438
| 3 | 4 |
438-
439439
</details>
440440

441441
#### sequential_values ([source](macros/generic_tests/sequential_values.sql))
@@ -551,7 +551,7 @@ These macros run a query and return the results of the query as objects. They ar
551551
#### get_column_values ([source](macros/sql/get_column_values.sql))
552552
This macro returns the unique values for a column in a given [relation](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation) as an array.
553553

554-
Arguments:
554+
**Args:**
555555
- `table` (required): a [Relation](https://docs.getdbt.com/reference/dbt-classes#relation) (a `ref` or `source`) that contains the list of columns you wish to select from
556556
- `column` (required): The name of the column you wish to find the column values of
557557
- `order_by` (optional, default=`'count(*) desc'`): How the results should be ordered. The default is to order by `count(*) desc`, i.e. decreasing frequency. Setting this as `'my_column'` will sort alphabetically, while `'min(created_at)'` will sort by when the value was first observed.
@@ -592,6 +592,28 @@ Arguments:
592592
...
593593
```
594594

595+
#### get_filtered_columns_in_relation ([source](macros/sql/get_filtered_columns_in_relation.sql))
596+
This macro returns an iterable Jinja list of columns for a given [relation](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation), (i.e. not from a CTE)
597+
- optionally exclude columns
598+
- the input values are not case-sensitive (input uppercase or lowercase and it will work!)
599+
> Note: The native [`adapter.get_columns_in_relation` macro](https://docs.getdbt.com/reference/dbt-jinja-functions/adapter#get_columns_in_relation) allows you
600+
to pull column names in a non-filtered fashion, also bringing along with it other (potentially unwanted) information, such as dtype, char_size, numeric_precision, etc.
601+
602+
**Args:**
603+
- `from` (required): a [Relation](https://docs.getdbt.com/reference/dbt-classes#relation) (a `ref` or `source`) that contains the list of columns you wish to select from
604+
- `except` (optional, default=`[]`): The name of the columns you wish to exclude. (case-insensitive)
605+
606+
**Usage:**
607+
```sql
608+
-- Returns a list of the columns from a relation, so you can then iterate in a for loop
609+
{% set column_names = dbt_utils.get_filtered_columns_in_relation(from=ref('your_model'), except=["field_1", "field_2"]) %}
610+
...
611+
{% for column_name in column_names %}
612+
max({{ column_name }}) ... as max_{{ column_name }},
613+
{% endfor %}
614+
...
615+
```
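The case-insensitive `except` handling described above can be pictured with a small Python sketch. This is not the macro's implementation: `filter_columns` is a hypothetical helper that only models the filtering step, and the column names are taken from the `field_1,field_2,field_3` seed added in this commit.

```python
def filter_columns(columns, except_=()):
    """Return the columns not listed in except_, comparing names case-insensitively."""
    # Normalise the exclusions once so "FIELD_1" and "field_1" are treated alike.
    excluded = {name.lower() for name in except_}
    return [col for col in columns if col.lower() not in excluded]


columns = ["field_1", "field_2", "field_3"]
remaining = filter_columns(columns, except_=["FIELD_1"])
print(remaining)
```

The surviving names are what a `{% for column_name in column_names %}` loop would iterate over; the real macro reads the column list from the warehouse rather than from a literal list.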
616+
595617
#### get_relations_by_pattern ([source](macros/sql/get_relations_by_pattern.sql))
596618
Returns a list of [Relations](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation)
597619
that match a given schema- or table-name pattern.
@@ -770,9 +792,20 @@ group by 1,2,3
770792
```
771793

772794
#### star ([source](macros/sql/star.sql))
773-
This macro generates a comma-separated list of all fields that exist in the `from` relation, excluding any fields listed in the `except` argument. The construction is identical to `select * from {{ref('my_model')}}`, replacing star (`*`) with the star macro. This macro also has an optional `relation_alias` argument that will prefix all generated fields with an alias (`relation_alias`.`field_name`).
795+
This macro generates a comma-separated list of all fields that exist in the `from` relation, excluding any fields
796+
listed in the `except` argument. The construction is identical to `select * from {{ref('my_model')}}`, replacing star (`*`) with
797+
the star macro.
798+
This macro also has an optional `relation_alias` argument that will prefix all generated fields with an alias (`relation_alias`.`field_name`).
799+
The macro also has optional `prefix` and `suffix` arguments. When one or both are provided, they will be concatenated onto each field's alias
800+
in the output (`prefix` ~ `field_name` ~ `suffix`). NB: This prevents the output from being used in any context other than a select statement.
801+
774802

775-
The macro also has optional `prefix` and `suffix` arguments. When one or both are provided, they will be concatenated onto each field's alias in the output (`prefix` ~ `field_name` ~ `suffix`). NB: This prevents the output from being used in any context other than a select statement.
803+
**Args:**
804+
- `from` (required): a [Relation](https://docs.getdbt.com/reference/dbt-classes#relation) (a `ref` or `source`) that contains the list of columns you wish to select from
805+
- `except` (optional, default=`[]`): The name of the columns you wish to exclude. (case-insensitive)
806+
- `relation_alias` (optional, default=`''`): will prefix all generated fields with an alias (`relation_alias`.`field_name`).
807+
- `prefix` (optional, default=`''`): will prefix the output `field_name` (`field_name as prefix_field_name`).
808+
- `suffix` (optional, default=`''`): will suffix the output `field_name` (`field_name as field_name_suffix`).
776809

777810
**Usage:**
778811
```sql
@@ -789,6 +822,13 @@ from {{ ref('my_model') }}
789822

790823
```
791824

825+
```sql
826+
select
827+
{{ dbt_utils.star(from=ref('my_model'), except=["exclude_field_1", "exclude_field_2"], prefix="max_") }}
828+
from {{ ref('my_model') }}
829+
830+
```
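As a rough illustration of how `except`, `relation_alias`, `prefix` and `suffix` combine, here is a Python model of the text the macro might render. It is a sketch under assumptions: the `star` function below is hypothetical, double-quoted identifiers are assumed (actual quoting depends on the adapter), and the real macro looks the columns up from the relation.

```python
def star(columns, except_=(), relation_alias="", prefix="", suffix=""):
    """Model the select list: skip excluded columns, then alias the rest."""
    excluded = {name.lower() for name in except_}
    rendered = []
    for col in columns:
        if col.lower() in excluded:
            continue
        expr = f'"{col}"'
        if relation_alias:
            # Qualify the field with the alias: relation_alias."field_name"
            expr = f"{relation_alias}.{expr}"
        if prefix or suffix:
            # An alias is only emitted when a prefix or suffix is supplied.
            expr += f' as "{prefix}{col}{suffix}"'
        rendered.append(expr)
    return ",\n".join(rendered)


sql = star(["field_1", "field_2", "field_3"], except_=["FIELD_3"], prefix="max_")
print(sql)
```

Because the `as ...` alias is part of each rendered field, the prefixed or suffixed form only makes sense inside a select list, which is the caveat the README text notes.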
831+
792832
#### union_relations ([source](macros/sql/union.sql))
793833

794834
This macro unions together an array of [Relations](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation),
@@ -987,9 +1027,16 @@ This macro calculates the difference between two dates.
9871027
#### split_part ([source](macros/cross_db_utils/split_part.sql))
9881028
This macro splits a string of text using the supplied delimiter and returns the supplied part number (1-indexed).
9891029

1030+
**Args:**
1031+
- `string_text` (required): Text to be split into parts.
1032+
- `delimiter_text` (required): Text representing the delimiter to split by.
1033+
- `part_number` (required): Requested part of the split (1-based). If the value is negative, the parts are counted backward from the end of the string.
1034+
9901035
**Usage:**
1036+
When referencing a column, use one pair of quotes. When referencing a string, use single quotes enclosed in double quotes.
9911037
```
992-
{{ dbt_utils.split_part(string_text='1,2,3', delimiter_text=',', part_number=1) }}
1038+
{{ dbt_utils.split_part(string_text='column_to_split', delimiter_text='delimiter_column', part_number=1) }}
1039+
{{ dbt_utils.split_part(string_text="'1|2|3'", delimiter_text="'|'", part_number=1) }}
9931040
```
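The 1-based and negative `part_number` semantics described in the argument list can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not what the macro compiles to; the `split_part` function here is a hypothetical stand-in, and returning an empty string for an out-of-range part is an assumption, since warehouses differ on that edge case.

```python
def split_part(string_text, delimiter_text, part_number):
    """Return the requested part: 1-based from the start, negative from the end."""
    if part_number == 0:
        raise ValueError("part_number is 1-based and must not be 0")
    parts = string_text.split(delimiter_text)
    # Positive numbers shift to 0-based; negatives already count from the end.
    index = part_number - 1 if part_number > 0 else part_number
    try:
        return parts[index]
    except IndexError:
        return ""  # assumption: out-of-range parts come back empty


first = split_part("1|2|3", "|", 1)
last = split_part("1|2|3", "|", -1)
print(first, last)
```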
9941041

9951042
#### date_trunc ([source](macros/cross_db_utils/date_trunc.sql))
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
+group_col,string_text,order_col
+1,a,1
+1,b,2
+1,c,3
+2,a,2
+2,1,1
+2,p,3
+3,g,1
+3,g,2
+3,g,3
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
+group_col,expected,version
+1,"a_|_b_|_c",bottom_ordered
+2,"1_|_a_|_p",bottom_ordered
+3,"g_|_g_|_g",bottom_ordered
+1,"a_|_b",bottom_ordered_limited
+2,"1_|_a",bottom_ordered_limited
+3,"g_|_g",bottom_ordered_limited
+3,"g, g, g",comma_whitespace_unordered
+3,"g",distinct_comma
+3,"g,g,g",no_params
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
+field_1,field_2,field_3
+a,b,c
+d,e,f
+g,h,i
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
+field_2,field_3
+h,i
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
+{% macro assert_equal_values(actual_object, expected_object) %}
+{% if not execute %}
+
+    {# pass #}
+
+{% elif actual_object != expected_object %}
+
+    {% set msg %}
+    Expected did not match actual
+
+    -----------
+    Actual:
+    -----------
+    --->{{ actual_object }}<---
+
+    -----------
+    Expected:
+    -----------
+    --->{{ expected_object }}<---
+
+    {% endset %}
+
+    {{ log(msg, info=True) }}
+
+    select 'fail'
+
+{% else %}
+
+    select 'ok' {{ limit_zero() }}
+
+{% endif %}
+{% endmacro %}

integration_tests/models/cross_db_utils/schema.yml

+6
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,12 @@ models:
5858
- assert_equal:
5959
actual: actual
6060
expected: expected
61+
62+
- name: test_listagg
63+
tests:
64+
- assert_equal:
65+
actual: actual
66+
expected: expected
6167

6268
- name: test_safe_cast
6369
tests:
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,69 @@
+with data as (
+
+    select * from {{ ref('data_listagg') }}
+
+),
+
+data_output as (
+
+    select * from {{ ref('data_listagg_output') }}
+
+),
+
+calculate as (
+
+    select
+        group_col,
+        {{ dbt_utils.listagg('string_text', "'_|_'", "order by order_col") }} as actual,
+        'bottom_ordered' as version
+    from data
+    group by group_col
+
+    union all
+
+    select
+        group_col,
+        {{ dbt_utils.listagg('string_text', "'_|_'", "order by order_col", 2) }} as actual,
+        'bottom_ordered_limited' as version
+    from data
+    group by group_col
+
+    union all
+
+    select
+        group_col,
+        {{ dbt_utils.listagg('string_text', "', '") }} as actual,
+        'comma_whitespace_unordered' as version
+    from data
+    where group_col = 3
+    group by group_col
+
+    union all
+
+    select
+        group_col,
+        {{ dbt_utils.listagg('DISTINCT string_text', "','") }} as actual,
+        'distinct_comma' as version
+    from data
+    where group_col = 3
+    group by group_col
+
+    union all
+
+    select
+        group_col,
+        {{ dbt_utils.listagg('string_text') }} as actual,
+        'no_params' as version
+    from data
+    where group_col = 3
+    group by group_col
+
+)
+
+select
+    calculate.actual,
+    data_output.expected
+from calculate
+left join data_output
+on calculate.group_col = data_output.group_col
+and calculate.version = data_output.version
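To make the five test versions easier to follow, here is a small Python model of what the integration test expects from `listagg`. It is a sketch only: the `listagg` function below is hypothetical and not the macro's SQL, its keyword arguments (`ordered`, `limit`, `distinct`) are illustrative names, and the rows and expected strings are copied from the `data_listagg` and `data_listagg_output` seeds in this commit.

```python
from collections import defaultdict

# (group_col, string_text, order_col) rows from the data_listagg seed.
ROWS = [
    (1, "a", 1), (1, "b", 2), (1, "c", 3),
    (2, "a", 2), (2, "1", 1), (2, "p", 3),
    (3, "g", 1), (3, "g", 2), (3, "g", 3),
]


def listagg(rows, delimiter=",", ordered=False, limit=None, distinct=False):
    """Concatenate string_text per group, optionally ordered, de-duplicated, limited."""
    groups = defaultdict(list)
    for group_col, string_text, order_col in rows:
        groups[group_col].append((order_col, string_text))

    result = {}
    for group_col, items in groups.items():
        if ordered:
            # Mirrors "order by order_col" inside the aggregate.
            items = sorted(items, key=lambda item: item[0])
        values = [text for _, text in items]
        if distinct:
            # Drop duplicates while keeping first-seen order.
            values = list(dict.fromkeys(values))
        if limit is not None:
            values = values[:limit]
        result[group_col] = delimiter.join(values)
    return result


bottom_ordered = listagg(ROWS, "_|_", ordered=True)
bottom_ordered_limited = listagg(ROWS, "_|_", ordered=True, limit=2)
no_params = listagg(ROWS)
print(bottom_ordered, bottom_ordered_limited, no_params[3])
```

Group 2 is the case that makes the test "less likely to accidentally work": its rows are stored out of order, so the expected `1_|_a_|_p` only appears when the ordering clause is actually applied.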

integration_tests/models/sql/schema.yml

+10
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,11 @@ models:
5050
values:
5151
- '5'
5252

53+
- name: test_get_filtered_columns_in_relation
54+
tests:
55+
- dbt_utils.equality:
56+
compare_model: ref('data_filtered_columns_in_relation_expected')
57+
5358
- name: test_get_relations_by_prefix_and_union
5459
columns:
5560
- name: event
@@ -121,6 +126,11 @@ models:
121126
- dbt_utils.equality:
122127
compare_model: ref('data_star_aggregate_expected')
123128

129+
- name: test_star_uppercase
130+
tests:
131+
- dbt_utils.equality:
132+
compare_model: ref('data_star_expected')
133+
124134
- name: test_surrogate_key
125135
tests:
126136
- assert_equal:
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
+{% set exclude_field = 'field_1' %}
+{% set column_names = dbt_utils.get_filtered_columns_in_relation(from= ref('data_filtered_columns_in_relation'), except=[exclude_field]) %}
+
+with data as (
+
+    select
+
+        {% for column_name in column_names %}
+            max({{ column_name }}) as {{ column_name }} {% if not loop.last %},{% endif %}
+        {% endfor %}
+
+    from {{ ref('data_filtered_columns_in_relation') }}
+
+)
+
+select * from data
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
+{% set exclude_field = 'FIELD_3' %}
+
+
+with data as (
+
+    select
+        {{ dbt_utils.star(from=ref('data_star'), except=[exclude_field]) }}
+
+    from {{ ref('data_star') }}
+
+)
+
+select * from data
