
Commit 1bb54ae

kentdanas and TJaniF authored

Update example DAG to DevRel-owned consistent example (#19)

* Update example DAG to DevRel-owned consistent example
* Apply suggestions from code review

Co-authored-by: Tamara Janina Fingerlin <90063506+TJaniF@users.noreply.github.com>

1 parent aeddff6 commit 1bb54ae

File tree

5 files changed: +82 −39 lines changed

Dockerfile (+1 −2)

```diff
@@ -1,2 +1 @@
-FROM quay.io/astronomer/astro-runtime:10.4.0
-ADD scripts/s3_countries_transform_script.sh /usr/local/airflow/transform_script.sh
+FROM quay.io/astronomer/astro-runtime:10.4.0
```

README.md (+2 −3)

```diff
@@ -8,9 +8,8 @@ Project Contents

 Your Astro project contains the following files and folders:

-- dags: This folder contains the Python files for your Airflow DAGs. By default, this directory includes two example DAGs:
-    - `example_dag_basic`: This DAG shows a simple ETL data pipeline example with three TaskFlow API tasks that run daily.
-    - `example_dag_advanced`: This advanced DAG showcases a variety of Airflow features like branching, Jinja templates, task groups and several Airflow operators.
+- dags: This folder contains the Python files for your Airflow DAGs. By default, this directory includes one example DAG:
+    - `example_astronauts`: This DAG shows a simple ETL pipeline example that queries the list of astronauts currently in space from the Open Notify API and prints a statement for each astronaut. The DAG uses the TaskFlow API to define tasks in Python, and dynamic task mapping to dynamically print a statement for each astronaut. For more on how this DAG works, see our [Getting started tutorial](https://docs.astronomer.io/learn/get-started-with-airflow).
 - Dockerfile: This file contains a versioned Astro Runtime Docker image that provides a differentiated Airflow experience. If you want to execute other commands or overrides at runtime, specify them here.
 - include: This folder contains any additional files that you want to include as part of your project. It is empty by default.
 - packages.txt: Install OS-level packages needed for your project by adding them to this file. It is empty by default.
```
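The `example_astronauts` bullet above describes parsing the Open Notify API response. As a minimal sketch (a hypothetical sample payload, no network call), the JSON shape the DAG consumes is a `number` count plus a `people` list of `{"craft", "name"}` records:

```python
# Hypothetical sample payload shaped like the Open Notify astros.json
# response: a "number" count plus a "people" list of craft/name records.
sample_response = {
    "number": 2,
    "people": [
        {"craft": "ISS", "name": "Astronaut One"},
        {"craft": "ISS", "name": "Astronaut Two"},
    ],
}

# Pull out the count and build one statement per astronaut, as the DAG does.
number_in_space = sample_response["number"]
statements = [
    f"{p['name']} is currently in space flying on the {p['craft']}!"
    for p in sample_response["people"]
]
```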

dags/example_astronauts.py (+79, new file)

```python
"""
## Astronaut ETL example DAG

This DAG queries the list of astronauts currently in space from the
Open Notify API and prints each astronaut's name and flying craft.

There are two tasks, one to get the data from the API and save the results,
and another to print the results. Both tasks are written in Python using
Airflow's TaskFlow API, which allows you to easily turn Python functions into
Airflow tasks, and automatically infer dependencies and pass data.

The second task uses dynamic task mapping to create a copy of the task for
each Astronaut in the list retrieved from the API. This list will change
depending on how many Astronauts are in space, and the DAG will adjust
accordingly each time it runs.

For more explanation and getting started instructions, see our Write your
first DAG tutorial: https://docs.astronomer.io/learn/get-started-with-airflow

![Picture of the ISS](https://www.esa.int/var/esa/storage/images/esa_multimedia/images/2010/02/space_station_over_earth/10293696-3-eng-GB/Space_Station_over_Earth_card_full.jpg)
"""

from airflow import Dataset
from airflow.decorators import dag, task
from pendulum import datetime
import requests


# Define the basic parameters of the DAG, like schedule and start_date
@dag(
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    doc_md=__doc__,
    default_args={"owner": "Astro", "retries": 3},
    tags=["example"],
)
def example_astronauts():
    # Define tasks
    @task(
        # Define a dataset outlet for the task. This can be used to schedule
        # downstream DAGs when this task has run.
        outlets=[Dataset("current_astronauts")]
    )  # Define that this task updates the `current_astronauts` Dataset
    def get_astronauts(**context) -> list[dict]:
        """
        This task uses the requests library to retrieve a list of Astronauts
        currently in space. The results are pushed to XCom with a specific key
        so they can be used in a downstream pipeline. The task returns a list
        of Astronauts to be used in the next task.
        """
        r = requests.get("http://api.open-notify.org/astros.json")
        number_of_people_in_space = r.json()["number"]
        list_of_people_in_space = r.json()["people"]

        context["ti"].xcom_push(
            key="number_of_people_in_space", value=number_of_people_in_space
        )
        return list_of_people_in_space

    @task
    def print_astronaut_craft(greeting: str, person_in_space: dict) -> None:
        """
        This task creates a print statement with the name of an
        Astronaut in space and the craft they are flying on from
        the API request results of the previous task, along with a
        greeting which is hard-coded in this example.
        """
        craft = person_in_space["craft"]
        name = person_in_space["name"]

        print(f"{name} is currently in space flying on the {craft}! {greeting}")

    # Use dynamic task mapping to run the print_astronaut_craft task for each
    # Astronaut in space
    print_astronaut_craft.partial(greeting="Hello! :)").expand(
        person_in_space=get_astronauts()  # Define dependencies using TaskFlow API syntax
    )


# Instantiate the DAG
example_astronauts()
```
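The `.partial(...).expand(...)` call at the end of the DAG pins the constant `greeting` argument and fans the task out once per astronaut in the mapped list. A rough pure-Python analogy of that mapping idea (this is not the Airflow API, and the `people` list is a hypothetical stand-in for the API result):

```python
# Pure-Python analogy of Airflow's .partial().expand(): "partial" pins the
# constant argument (greeting), and "expand" fans out one call per element
# of the mapped argument (person_in_space).
from functools import partial


def print_astronaut_craft(greeting: str, person_in_space: dict) -> str:
    return (
        f"{person_in_space['name']} is currently in space "
        f"flying on the {person_in_space['craft']}! {greeting}"
    )


# Hypothetical stand-in for the list returned by get_astronauts()
people = [
    {"craft": "ISS", "name": "Astronaut One"},
    {"craft": "ISS", "name": "Astronaut Two"},
]

mapped = partial(print_astronaut_craft, greeting="Hello! :)")
results = [mapped(person_in_space=p) for p in people]
# One result per astronaut, mirroring one mapped task instance per list element
```

In Airflow, the length of the mapped list is only known at runtime, so the number of `print_astronaut_craft` task instances adjusts on every DAG run.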

dags/s3_transform.py (−32)

This file was deleted.

scripts/s3_countries_transform_script.sh (−2)

This file was deleted.
