Skip to content

Commit aaccb78

Browse files
bastienboutonnetclrcrl
authored andcommitted
Support kms in haversine_distance macro (#340)
1 parent 4b9328a commit aaccb78

10 files changed

+136
-22
lines changed

CHANGELOG.md

+1
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
* Support a new argument, `zero_length_range_allowed` in the `mutually_exclusive_ranges` test ([#307](https://github.com/fishtown-analytics/dbt-utils/pull/307) [@zemekeng](https://github.com/zemekeneng))
77
* Add new schema test, `sequential_values` ([#318](https://github.com/fishtown-analytics/dbt-utils/pull/318), inspired by [@hundredwatt](https://github.com/hundredwatt))
88
* Support `quarter` in the `postgres__last_day` macro ([#333](https://github.com/fishtown-analytics/dbt-utils/pull/333/files), [@seunghanhong](https://github.com/seunghanhong))
9+
* Add new argument, `unit`, to `haversine_distance` [#340](https://github.com/fishtown-analytics/dbt-utils/pull/340) [@bastienboutonnet](https://github.com/bastienboutonnet)
910

1011

1112
## Fixes

README.md

+16-13
Original file line numberDiff line numberDiff line change
@@ -95,7 +95,7 @@ Usage:
9595
---
9696
### Date/Time
9797
#### date_spine ([source](macros/datetime/date_spine.sql))
98-
This macro returns the sql required to build a date spine. The spine will include the `start_date` (if it is aligned to the `datepart`), but it will not include the `end_date`.
98+
This macro returns the sql required to build a date spine. The spine will include the `start_date` (if it is aligned to the `datepart`), but it will not include the `end_date`.
9999

100100
Usage:
101101
```
@@ -111,9 +111,12 @@ Usage:
111111
#### haversine_distance ([source](macros/geo/haversine_distance.sql))
112112
This macro calculates the [haversine distance](http://daynebatten.com/2015/09/latitude-longitude-distance-sql/) between a pair of x/y coordinates.
113113

114-
Usage:
114+
Optionally takes a `unit` string parameter ('km' or 'mi') which defaults to miles (imperial system).
115+
116+
**Usage:**
117+
115118
```
116-
{{ dbt_utils.haversine_distance(lat1=<float>,lon1=<float>,lat2=<float>,lon2=<float>) }}
119+
{{ dbt_utils.haversine_distance(lat1=<float>,lon1=<float>,lat2=<float>,lon2=<float>, unit='mi'<string>) }}
117120
```
118121
---
119122
### Schema Tests
@@ -181,13 +184,13 @@ models:
181184

182185
```
183186

184-
This macro can also be used at the column level. When this is done, the `expression` is evaluated against the column.
187+
This macro can also be used at the column level. When this is done, the `expression` is evaluated against the column.
185188

186189
```yaml
187190
version: 2
188-
models:
191+
models:
189192
- name: model_name
190-
columns:
193+
columns:
191194
- name: col_a
192195
tests:
193196
- dbt_utils.expression_is_true:
@@ -197,7 +200,7 @@ models:
197200
- dbt_utils.expression_is_true:
198201
expression: '= 1'
199202
condition: col_a = 1
200-
203+
201204
```
202205

203206

@@ -361,7 +364,7 @@ models:
361364
upper_bound_column: ended_at
362365
partition_by: customer_id
363366
gaps: required
364-
367+
365368
# test that each customer can have subscriptions that start and end on the same date
366369
- name: subscriptions
367370
tests:
@@ -512,9 +515,9 @@ An optional `quote_columns` parameter (`default=false`) can also be used if a co
512515

513516

514517
#### accepted_range ([source](macros/schema_tests/accepted_range.sql))
515-
This test checks that a column's values fall inside an expected range. Any combination of `min_value` and `max_value` is allowed, and the range can be inclusive or exclusive. Provide a `where` argument to filter to specific records only.
518+
This test checks that a column's values fall inside an expected range. Any combination of `min_value` and `max_value` is allowed, and the range can be inclusive or exclusive. Provide a `where` argument to filter to specific records only.
516519

517-
In addition to comparisons to a scalar value, you can also compare to another column's values. Any data type that supports the `>` or `<` operators can be compared, so you could also run tests like checking that all order dates are in the past.
520+
In addition to comparisons to a scalar value, you can also compare to another column's values. Any data type that supports the `>` or `<` operators can be compared, so you could also run tests like checking that all order dates are in the past.
518521

519522
Usage:
520523
```yaml
@@ -528,19 +531,19 @@ models:
528531
- dbt_utils.accepted_range:
529532
min_value: 0
530533
inclusive: false
531-
534+
532535
- name: account_created_at
533536
tests:
534537
- dbt_utils.accepted_range:
535538
max_value: "getdate()"
536539
#inclusive is true by default
537-
540+
538541
- name: num_returned_orders
539542
tests:
540543
- dbt_utils.accepted_range:
541544
min_value: 0
542545
max_value: "num_orders"
543-
546+
544547
- name: num_web_sessions
545548
tests:
546549
- dbt_utils.accepted_range:
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
lat_1,lon_1,lat_2,lon_2,output
2+
48.864716,2.349014,52.379189,4.899431,430
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
lat_1,lon_1,lat_2,lon_2,output
2+
48.864716,2.349014,52.379189,4.899431,267

integration_tests/dbt_project.yml

+2-2
Original file line numberDiff line numberDiff line change
@@ -53,8 +53,8 @@ seeds:
5353
sql:
5454
data_events_20180103:
5555
+schema: events
56-
56+
5757
schema_tests:
5858
data_test_sequential_timestamps:
5959
+column_types:
60-
my_timestamp: timestamp
60+
my_timestamp: timestamp

integration_tests/macros/tests.sql

-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,5 @@
11

22
{% macro test_assert_equal(model, actual, expected) %}
3-
43
select count(*) from {{ model }} where {{ actual }} != {{ expected }}
54

65
{% endmacro %}
+13
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
version: 2
2+
3+
models:
4+
- name: test_haversine_distance_km
5+
tests:
6+
- assert_equal:
7+
actual: actual
8+
expected: expected
9+
- name: test_haversine_distance_mi
10+
tests:
11+
- assert_equal:
12+
actual: actual
13+
expected: expected
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
with data as (
2+
select * from {{ ref('data_haversine_km') }}
3+
),
4+
final as (
5+
select
6+
output as expected,
7+
cast(
8+
{{
9+
dbt_utils.haversine_distance(
10+
lat1='lat_1',
11+
lon1='lon_1',
12+
lat2='lat_2',
13+
lon2='lon_2',
14+
unit='km'
15+
)
16+
}} as numeric
17+
) as actual
18+
from data
19+
)
20+
select
21+
expected,
22+
round(actual,0) as actual
23+
from final
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
with data as (
2+
select * from {{ ref('data_haversine_mi') }}
3+
),
4+
final as (
5+
select
6+
output as expected,
7+
cast(
8+
{{
9+
dbt_utils.haversine_distance(
10+
lat1='lat_1',
11+
lon1='lon_1',
12+
lat2='lat_2',
13+
lon2='lon_2',
14+
unit='mi'
15+
)
16+
}} as numeric
17+
) as actual
18+
from data
19+
20+
union all
21+
22+
select
23+
output as expected,
24+
cast(
25+
{{
26+
dbt_utils.haversine_distance(
27+
lat1='lat_1',
28+
lon1='lon_1',
29+
lat2='lat_2',
30+
lon2='lon_2',
31+
)
32+
}} as numeric
33+
) as actual
34+
from data
35+
)
36+
select
37+
expected,
38+
round(actual,0) as actual
39+
from final

macros/geo/haversine_distance.sql

+38-6
Original file line numberDiff line numberDiff line change
@@ -3,17 +3,49 @@ This calculates the distance between two sets of latitude and longitude.
33
The formula is from the following blog post:
44
http://daynebatten.com/2015/09/latitude-longitude-distance-sql/
55

6-
The arguments should be float type.
6+
The arguments should be float type.
77
#}
88

9-
{% macro haversine_distance(lat1,lon1,lat2,lon2) -%}
10-
{{ return(adapter.dispatch('haversine_distance', packages = dbt_utils._get_utils_namespaces())(lat1,lon1,lat2,lon2)) }}
9+
{% macro degrees_to_radians(degrees) -%}
10+
acos(-1) * {{degrees}} / 180
11+
{%- endmacro %}
12+
13+
{% macro haversine_distance(lat1, lon1, lat2, lon2, unit='mi') -%}
14+
{{ return(adapter.dispatch('haversine_distance', packages = dbt_utils._get_utils_namespaces())(lat1,lon1,lat2,lon2,unit)) }}
1115
{% endmacro %}
1216

13-
{% macro default__haversine_distance(lat1,lon1,lat2,lon2) -%}
17+
{% macro default__haversine_distance(lat1, lon1, lat2, lon2, unit='mi') -%}
18+
{%- if unit == 'mi' %}
19+
{% set conversion_rate = 1 %}
20+
{% elif unit == 'km' %}
21+
{% set conversion_rate = 1.60934 %}
22+
{% else %}
23+
{{ exceptions.raise_compiler_error("unit input must be one of 'mi' or 'km'. Got " ~ unit) }}
24+
{% endif %}
1425

15-
2 * 3961 * asin(sqrt((sin(radians(({{lat2}} - {{lat1}}) / 2))) ^ 2 +
26+
2 * 3961 * asin(sqrt(pow((sin(radians(({{ lat2 }} - {{ lat1 }}) / 2))), 2) +
1627
cos(radians({{lat1}})) * cos(radians({{lat2}})) *
17-
(sin(radians(({{lon2}} - {{lon1}}) / 2))) ^ 2))
28+
pow((sin(radians(({{ lon2 }} - {{ lon1 }}) / 2))), 2))) * {{ conversion_rate }}
29+
30+
{%- endmacro %}
31+
32+
33+
34+
{% macro bigquery__haversine_distance(lat1, lon1, lat2, lon2, unit='mi') -%}
35+
{% set radians_lat1 = dbt_utils.degrees_to_radians(lat1) %}
36+
{% set radians_lat2 = dbt_utils.degrees_to_radians(lat2) %}
37+
{% set radians_lon1 = dbt_utils.degrees_to_radians(lon1) %}
38+
{% set radians_lon2 = dbt_utils.degrees_to_radians(lon2) %}
39+
{%- if unit == 'mi' %}
40+
{% set conversion_rate = 1 %}
41+
{% elif unit == 'km' %}
42+
{% set conversion_rate = 1.60934 %}
43+
{% else %}
44+
{{ exceptions.raise_compiler_error("unit input must be one of 'mi' or 'km'. Got " ~ unit) }}
45+
{% endif %}
46+
2 * 3961 * asin(sqrt(pow(sin(({{ radians_lat2 }} - {{ radians_lat1 }}) / 2), 2) +
47+
cos({{ radians_lat1 }}) * cos({{ radians_lat2 }}) *
48+
pow(sin(({{ radians_lon2 }} - {{ radians_lon1 }}) / 2), 2))) * {{ conversion_rate }}
1849

1950
{%- endmacro %}
51+

0 commit comments

Comments
 (0)