Commit 8035f87

gleb-lobov and cksnp authored
Add data products for snowplow-cli (#1101)
* Add snowplow-cli documentation for data products

* Add data products cli instructions

* Apply suggestions from code review

Co-authored-by: Costas Kotsokalis <55377146+cksnp@users.noreply.github.com>

* One more missed spec -> specification

* Remove github annnotate on publish

* Even more renames

* Amend to new sidebar

* Add the manage data products in CLI section

* Fix typo

---------

Co-authored-by: Costas Kotsokalis <55377146+cksnp@users.noreply.github.com>
1 parent 8e8b7b5 commit 8035f87

9 files changed: +933 -75 lines changed

@@ -0,0 +1,100 @@
---
title: "Managing data products via the CLI"
description: "Use the 'snowplow-cli data-products' command to manage your data products."
sidebar_label: "Using the CLI"
sidebar_position: 999
---

```mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
```

The `data-products` subcommand of [Snowplow CLI](/docs/data-product-studio/snowplow-cli/index.md) provides commands that make it easier to integrate data product management into custom development and publishing workflows.

## Snowplow CLI Prerequisites

You need [Snowplow CLI](/docs/data-product-studio/snowplow-cli/index.md) installed and configured.

## Available commands

### Creating data product

```bash
./snowplow-cli dp generate --data-product my-data-product
```

This command creates a minimal data product template in a new file `./data-products/my-data-product.yaml`.

### Creating source application

```bash
./snowplow-cli dp generate --source-app my-source-app
```

This command creates a minimal source application template in a new file `./data-products/source-apps/my-source-app.yaml`.

### Creating event specification

To create an event specification, you need to modify the existing data-product file and add an event specification object. Here's a minimal example:

```yaml title="./data-products/test-cli.yaml"
apiVersion: v1
resourceType: data-product
resourceName: 3d3059c4-d29b-4979-a973-43f7070e1dd0
data:
  name: test-cli
  sourceApplications: []
  eventSpecifications:
    - resourceName: 11d881cd-316e-4286-b5d4-fe7aebf56fca
      name: test
      event:
        source: iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0
```

:::caution Warning

The `source` fields of events and entities must refer to a deployed data structure. Referring to a locally created data structure is not yet supported.

:::
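
Since `source` has to be an Iglu URI rather than a local path, a quick shape check can catch obvious mistakes before validation. The sketch below is only an illustration: `is_iglu_uri` is a hypothetical helper, not part of `snowplow-cli`, and it checks the URI format only. It cannot tell whether the data structure is actually deployed.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of snowplow-cli): checks only the *shape*
# of an Iglu URI, i.e. iglu:vendor/name/format/MODEL-REVISION-ADDITION.
# It does not contact any registry, so it cannot confirm deployment.
is_iglu_uri() {
  printf '%s\n' "$1" |
    grep -Eq '^iglu:[A-Za-z0-9_.-]+/[A-Za-z0-9_-]+/[A-Za-z0-9_-]+/[0-9]+-[0-9]+-[0-9]+$'
}

if is_iglu_uri 'iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0'; then
  echo "looks like an Iglu URI"
fi
```

A local file path such as `./data-structures/button_click.yaml` fails this check, which matches the restriction described in the warning above.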

### Linking data product to a source application

To link a data product to a source application, provide a list of references to the source application files in the `data.sourceApplications` field. Here's an example:

```yaml title="./data-products/test-cli.yaml"
apiVersion: v1
resourceType: data-product
resourceName: 3d3059c4-d29b-4979-a973-43f7070e1dd0
data:
  name: test-cli
  sourceApplications:
    - $ref: ./source-apps/my-source-app.yaml
```

### Modifying the event specifications source applications

By default, event specifications inherit all the source applications of the data product. To customise this, use the `excludedSourceApplications` field in the event specification to remove a given source application from that event specification.

```yaml title="./data-products/test-cli.yaml"
apiVersion: v1
resourceType: data-product
resourceName: 3d3059c4-d29b-4979-a973-43f7070e1dd0
data:
  name: test-cli
  sourceApplications:
    - $ref: ./source-apps/generic.yaml
    - $ref: ./source-apps/specific.yaml
  eventSpecifications:
    - resourceName: 11d881cd-316e-4286-b5d4-fe7aebf56fca
      name: All source apps
      event:
        source: iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0
    - resourceName: b9c994a0-03b2-479c-b1cf-7d25c3adc572
      name: Not quite everything
      excludedSourceApplications:
        - $ref: ./source-apps/specific.yaml
      event:
        source: iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0
```

In this example, the event specification `All source apps` is related to both the `generic` and `specific` source apps, but the event specification `Not quite everything` is related only to the `generic` source application.
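
The inheritance rule can be pictured as a set difference: the data product's source applications minus the ones an event specification excludes. The following sketch only illustrates that rule; `effective_source_apps` is a hypothetical function, and nothing here claims to mirror how the CLI or Console computes the result.

```bash
#!/usr/bin/env bash
# effective_source_apps INHERITED EXCLUDED  (hypothetical illustration)
# Both arguments are newline-separated lists of $ref paths.
# Prints the inherited refs that do not appear in EXCLUDED.
effective_source_apps() {
  if [ -z "$2" ]; then
    # Nothing excluded: the event specification inherits everything.
    printf '%s\n' "$1"
  else
    # -F fixed strings, -x whole-line match, -v keep non-matching lines.
    printf '%s\n' "$1" | grep -vxF -e "$2" || true
  fi
}

inherited='./source-apps/generic.yaml
./source-apps/specific.yaml'

# "All source apps": no exclusions, so both refs are printed.
effective_source_apps "$inherited" ''

# "Not quite everything": only ./source-apps/generic.yaml is printed.
effective_source_apps "$inherited" './source-apps/specific.yaml'
```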

### Downloading data products, event specifications and source apps

```bash
./snowplow-cli dp download
```

This command retrieves all of your organization's data products, event specifications, and source applications. By default, it creates a folder named `data-products` in your current working directory. You can specify a different folder name as an argument if needed.

The command creates the following structure:

- A main `data-products` folder containing your data product files
- A `source-apps` subfolder containing source application definitions
- Event specifications embedded within their related data product files

### Validating data products, event specifications and source applications

```bash
./snowplow-cli dp validate
```

This command scans all files under `./data-products` and validates them using the BDP console. It checks:

1. Whether each file is in a valid format (YAML/JSON) with correctly formatted fields
2. Whether all source application references in the data product files are valid
3. Whether event specification rules are compatible with their schemas

If validation fails, the command displays the errors in the console and exits with status code 1.
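
Because a failed validation exits with a non-zero status, the command can act as a gate in a script or CI step. The wrapper below is a minimal sketch under that assumption; `run_gate` is a hypothetical name, and the real invocation is left as a comment because it needs an installed and configured `snowplow-cli`.

```bash
#!/usr/bin/env bash
# run_gate CMD [ARGS...]  (hypothetical wrapper)
# Runs the given command, prints a short summary, and propagates
# the command's exit status to the caller.
run_gate() {
  if "$@"; then
    echo "validation passed"
  else
    status=$?
    echo "validation failed (exit status $status)" >&2
    return "$status"
  fi
}

# Intended use in a CI step:
#   run_gate ./snowplow-cli dp validate || exit 1
```

Passing the command as arguments keeps the wrapper independent of where the binary lives, so the same function works with `snowplow-cli` on the `PATH` or with `./snowplow-cli`.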

### Publishing data products, event specifications and source applications

```bash
./snowplow-cli dp publish
```

This command locates all files under `./data-products`, validates them, and publishes them to the BDP console.

docs/data-product-studio/data-quality/index.md

+1 -1
@@ -1,7 +1,7 @@
 ---
 title: "Data quality"
 date: "2024-12-04"
-sidebar_position: 7
+sidebar_position: 8
 ---

 There are a number of ways you can test and QA your pipeline to follow good data practices.

docs/data-product-studio/data-structures/manage/cli/index.md

+2 -69
@@ -13,78 +13,11 @@ import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 ```

-The `data-structures` subcommand of [Snowplow CLI](https://github.com/snowplow-product/snowplow-cli) provides a collection of functionality to ease the integration of custom development and publishing workflows.
+The `data-structures` subcommand of [Snowplow CLI](/docs/data-product-studio/snowplow-cli/index.md) provides a collection of functionality to ease the integration of custom development and publishing workflows.

 ## Snowplow CLI Prerequisites

-### Install
-
-Snowplow CLI can be installed with [homebrew](https://brew.sh/)
-```
-brew install snowplow-product/taps/snowplow-cli
-```
-
-Verify the installation with
-```
-snowplow-cli --help
-```
-
-For systems where homebrew is not available binaries for multiple platforms can be found in [releases](https://github.com/snowplow-product/snowplow-cli/releases).
-
-Example installation for `linux_x86_64` using `curl`
-
-```bash
-curl -L -o snowplow-cli https://github.com/snowplow-product/snowplow-cli/releases/latest/download/snowplow-cli_linux_x86_64
-chmod u+x snowplow-cli
-```
-
-Verify the installation with
-```
-./snowplow-cli --help
-```
-
-### Configure
-
-You will need three values.
-
-API Key Id and API Key Secret are generated from the [credentials section](https://console.snowplowanalytics.com/credentials) in BDP Console.
-
-Organization Id can be retrieved from the URL immediately following the .com when visiting BDP console:
-
-![](images/orgID.png)
-
-Snowplow CLI can take its configuration from a variety of sources. More details are available from `./snowplow-cli data-structures --help`. Variations on these three examples should serve most cases.
-
-<Tabs groupId="config">
-<TabItem value="env" label="env variables" default>
-
-```bash
-SNOWPLOW_CONSOLE_API_KEY_ID=********-****-****-****-************
-SNOWPLOW_CONSOLE_API_KEY=********-****-****-****-************
-SNOWPLOW_CONSOLE_ORG_ID=********-****-****-****-************
-```
-
-</TabItem>
-<TabItem value="defaultconfig" label="$HOME/.config/snowplow/snowplow.yml" >
-
-```yaml
-console:
-  api-key-id: ********-****-****-****-************
-  api-key: ********-****-****-****-************
-  org-id: ********-****-****-****-************
-```
-
-</TabItem>
-<TabItem value="args" label="inline arguments" >
-
-```bash
-./snowplow-cli data-structures --api-key-id ********-****-****-****-************ --api-key ********-****-****-****-************ --org-id ********-****-****-****-************
-```
-
-</TabItem>
-</Tabs>
-
-Snowplow CLI defaults to yaml format. It can be changed to json by either providing a `--output-format json` flag or setting the `output-format: json` config value. It will work for all commands where it matters, not only for `generate`.
+Installed and configured [Snowplow CLI](/docs/data-product-studio/snowplow-cli/index.md)


 ## Available commands

docs/data-product-studio/data-structures/version-amend/index.md

+1 -1
@@ -2,7 +2,7 @@
 title: "Versioning data structures"
 date: "2020-02-25"
 sidebar_position: 2
-sidebar_label: "Verson and amend"
+sidebar_label: "Version and amend"
 ---

 Snowplow is designed to make it easy for you to change your tracking design in a safe and backwards-compatible way as your organisational data needs evolve.
@@ -0,0 +1,84 @@
---
title: Snowplow CLI
sidebar_label: Snowplow CLI
sidebar_position: 7
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

`snowplow-cli` brings the data management elements of Snowplow Console to the command line. It allows you to download your data structures and data products to YAML/JSON files and publish them back to Console. This enables git-ops-like workflows, with reviews and branching.

# Install

Snowplow CLI can be installed with [Homebrew](https://brew.sh/):
```
brew install snowplow-product/taps/snowplow-cli
```

Verify the installation with:
```
snowplow-cli --help
```

For systems where Homebrew is not available, binaries for multiple platforms can be found in [releases](https://github.com/snowplow-product/snowplow-cli/releases).

Example installation for `linux_x86_64` using `curl`:

```bash
curl -L -o snowplow-cli https://github.com/snowplow-product/snowplow-cli/releases/latest/download/snowplow-cli_linux_x86_64
chmod u+x snowplow-cli
```

Verify the installation with:
```
./snowplow-cli --help
```

# Configure

You will need three values.

An API Key Id and the corresponding API Key (secret), which are generated from the [credentials section](https://console.snowplowanalytics.com/credentials) in BDP Console.

The organization ID, which can be retrieved from the URL immediately following the .com when visiting BDP console:

![](./images/orgID.png)

Snowplow CLI can take its configuration from a variety of sources. More details are available from `./snowplow-cli data-structures --help`. Variations on these three examples should serve most cases.

<Tabs groupId="config">
<TabItem value="env" label="env variables" default>

```bash
SNOWPLOW_CONSOLE_API_KEY_ID=********-****-****-****-************
SNOWPLOW_CONSOLE_API_KEY=********-****-****-****-************
SNOWPLOW_CONSOLE_ORG_ID=********-****-****-****-************
```

</TabItem>
<TabItem value="defaultconfig" label="$HOME/.config/snowplow/snowplow.yml" >

```yaml
console:
  api-key-id: ********-****-****-****-************
  api-key: ********-****-****-****-************
  org-id: ********-****-****-****-************
```

</TabItem>
<TabItem value="args" label="inline arguments" >

```bash
./snowplow-cli data-structures --api-key-id ********-****-****-****-************ --api-key ********-****-****-****-************ --org-id ********-****-****-****-************
```

</TabItem>
</Tabs>
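
When the environment-variable option is used, a small pre-flight check can fail fast if one of the three values is missing. This is a minimal sketch rather than anything shipped with the CLI: `check_snowplow_env` is a hypothetical helper, and only the three variable names come from the example above.

```bash
#!/usr/bin/env bash
# check_snowplow_env  (hypothetical helper, not part of snowplow-cli)
# Succeeds only if all three configuration variables are set and non-empty;
# otherwise names each missing variable on stderr and returns 1.
check_snowplow_env() {
  missing=0
  for name in SNOWPLOW_CONSOLE_API_KEY_ID SNOWPLOW_CONSOLE_API_KEY SNOWPLOW_CONSOLE_ORG_ID; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script could call `check_snowplow_env || exit 1` before any `snowplow-cli` command, so that a missing credential is reported by name rather than surfacing later as a failed request.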

Snowplow CLI defaults to YAML format. To switch to JSON, either pass the `--output-format json` flag or set the `output-format: json` config value. This applies to every command where the format matters, not only `generate`.

# Use cases

- [Manage your data structures with snowplow-cli](/docs/data-product-studio/data-structures/manage/cli/index.md)
- [Set up a GitHub CI/CD pipeline to manage data structures and data products](/docs/resources/recipes-tutorials/recipe-data-structures-in-git/index.md)
