Improve memory footprint of Cosmos DbtDag, DbtTaskGroup and operators #1471
Labels: area:performance
Context
Some users have reported high memory consumption per Airflow worker node, both with and without Cosmos, when using the Airflow Celery Executor. Switching to the Airflow Kubernetes Executor did not help with task-level memory consumption.
A customer shared numbers: running a trivial Airflow task consumes around 250 MB without Cosmos, and using Cosmos increases that to 300 MB. While this may not be an issue when running small DAGs with few parallel tasks, it can become a resource management problem for larger workflows.
An example of how the memory was tracked was to instantiate a sequence of `PythonOperator`s that would each execute a function reporting the task process's memory usage.
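A minimal sketch of such a function, assuming `psutil` is available on the worker and that resident set size (RSS) is the metric of interest (the function name and measurement approach here are assumptions, not the original snippet):

```python
import os

import psutil


def report_task_memory() -> None:
    """Print the resident set size (RSS) of the current task process in MB."""
    rss_mb = psutil.Process(os.getpid()).memory_info().rss / (1024 * 1024)
    print(f"Task process RSS: {rss_mb:.1f} MB")
```

Each `PythonOperator` in the DAG would then pass `python_callable=report_task_memory`, so the per-task RSS shows up in the task logs.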
What we can do
There is a limit to what we can do at the Cosmos level to improve performance, since the plain Airflow baseline is already high.
The goal of this ticket is to:
- Move expensive imports and object instantiation inside functions (`def`) and not at top-level modules, as sketched below
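As an illustration of that deferred-import pattern (a hypothetical sketch, not Cosmos's actual code; `pandas` stands in for any heavy dependency):

```python
# Hypothetical sketch of the deferred-import pattern: the heavy dependency is
# imported inside the function that needs it, so workers that merely import
# the module (e.g. during DAG parsing) do not pay the memory cost.


def load_table(csv_path: str):
    # Lazy import: pandas is only loaded when this function actually runs.
    import pandas as pd

    return pd.read_csv(csv_path)
```

Keeping such imports out of module scope means the cost is paid only by the tasks that actually need the dependency, rather than by every process that parses the DAG file.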