Airflow needs to know how to connect to your environment. Information such as hostnames, ports, logins, and passwords to other systems and services is handled in the ``Admin->Connections`` section of the UI. The pipeline code you author will reference the ``conn_id`` of the Connection objects.

Connections can be created and managed using either the UI or environment variables.

See the :ref:`Connections Concepts <concepts-connections>` documentation for more information.
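For illustration, here is a minimal sketch of pipeline code referencing a connection through its ``conn_id``; the connection ID ``my_postgres`` and the table name are hypothetical placeholders and must match a connection you have actually defined:

.. code-block:: python

    # Minimal sketch: a task that references a connection by its conn_id.
    # 'my_postgres' and 'my_table' are hypothetical placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.postgres_operator import PostgresOperator

    dag = DAG(
        dag_id="connection_usage_example",
        start_date=datetime(2018, 1, 1),
        schedule_interval=None,
    )

    count_rows = PostgresOperator(
        task_id="count_rows",
        postgres_conn_id="my_postgres",  # the conn_id of the Connection object
        sql="SELECT COUNT(*) FROM my_table;",
        dag=dag,
    )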
Open the ``Admin->Connections`` section of the UI. Click the ``Create`` link to create a new connection.

- Fill in the ``Conn Id`` field with the desired connection ID. It is recommended that you use lower-case characters and separate words with underscores.
- Choose the connection type with the ``Conn Type`` field.
- Fill in the remaining fields. See :ref:`manage-connections-connection-types` for a description of the fields belonging to the different connection types.
- Click the ``Save`` button to create the connection.
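As a rough illustration of what the form fields map to, the same connection can also be created programmatically. This is only a sketch of an alternative to the UI workflow, and every field value below is a placeholder:

.. code-block:: python

    # Sketch only: programmatic equivalent of filling in the Create form.
    # All values are placeholders; adjust them for your environment.
    from airflow import settings
    from airflow.models import Connection

    conn = Connection(
        conn_id="my_postgres",      # Conn Id
        conn_type="postgres",       # Conn Type
        host="localhost",
        login="user",
        password="password",
        schema="master",
        port=5432,
    )

    session = settings.Session()
    session.add(conn)
    session.commit()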
Open the ``Admin->Connections`` section of the UI. Click the pencil icon next to the connection you wish to edit in the connection list. Modify the connection properties and click the ``Save`` button to save your changes.
Connections in Airflow pipelines can be created using environment variables. The environment variable must have the prefix ``AIRFLOW_CONN_``, and its value must be in a URI format for Airflow to use the connection properly.

When referencing the connection in the Airflow pipeline, the ``conn_id`` should be the name of the variable without the prefix. For example, if the ``conn_id`` is named ``postgres_master``, the environment variable should be named ``AIRFLOW_CONN_POSTGRES_MASTER`` (note that the environment variable must be all uppercase). Airflow assumes the value of the environment variable to be in a URI format (e.g. ``postgres://user:password@localhost:5432/master`` or ``s3://accesskey:secretkey@S3``).
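As a minimal sketch, assuming the connection is supplied entirely through the environment (the credentials below are placeholders), the mapping between the variable name and the ``conn_id`` looks like this:

.. code-block:: python

    # Equivalent to exporting the variable in the shell before starting Airflow:
    #   export AIRFLOW_CONN_POSTGRES_MASTER='postgres://user:password@localhost:5432/master'
    import os

    os.environ["AIRFLOW_CONN_POSTGRES_MASTER"] = (
        "postgres://user:password@localhost:5432/master"
    )

    from airflow.hooks.postgres_hook import PostgresHook

    # The conn_id is the variable name without the AIRFLOW_CONN_ prefix.
    hook = PostgresHook(postgres_conn_id="postgres_master")
    records = hook.get_records("SELECT 1;")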
The Google Cloud Platform connection type enables the :ref:`GCP Integrations <GCP>`.
There are two ways to connect to GCP using Airflow.
- Use Application Default Credentials, such as via the metadata server when running on Google Compute Engine.
- Use a service account key file (JSON format) on disk.
The following connection IDs are used by default.

- ``bigquery_default``: Used by the :class:`~airflow.contrib.hooks.bigquery_hook.BigQueryHook` hook.
- ``google_cloud_datastore_default``: Used by the :class:`~airflow.contrib.hooks.datastore_hook.DatastoreHook` hook.
- ``google_cloud_default``: Used by the :class:`~airflow.contrib.hooks.gcp_api_base_hook.GoogleCloudBaseHook`, :class:`~airflow.contrib.hooks.gcp_dataflow_hook.DataFlowHook`, :class:`~airflow.contrib.hooks.gcp_dataproc_hook.DataProcHook`, :class:`~airflow.contrib.hooks.gcp_mlengine_hook.MLEngineHook`, and :class:`~airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook` hooks.
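As a sketch, the hooks fall back to these IDs when no connection ID is passed explicitly (assuming the default connections exist and are configured for your project):

.. code-block:: python

    # Sketch: the contrib GCP hooks use the default connection IDs unless
    # an explicit ID is passed.
    from airflow.contrib.hooks.bigquery_hook import BigQueryHook
    from airflow.contrib.hooks.gcs_hook import GoogleCloudStorageHook

    bq = BigQueryHook()              # uses bigquery_default
    gcs = GoogleCloudStorageHook()   # uses google_cloud_default

    # Passing a connection ID overrides the default; 'my_bigquery' is a placeholder.
    bq_other = BigQueryHook(bigquery_conn_id="my_bigquery")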
- Project Id (required): The Google Cloud project ID to connect to.
- Keyfile Path: Path to a service account key file (JSON format) on disk. Not required if using application default credentials.
- Keyfile JSON: Contents of a service account key file (JSON format). It is recommended to :doc:`Secure your connections <secure-connections>` if using this method to authenticate. Not required if using application default credentials.
- Scopes (comma separated): A list of comma-separated Google Cloud scopes to authenticate with.

.. note::
    Scopes are ignored when using application default credentials. See issue AIRFLOW-2522.
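For reference, a connection with these fields can also be built in code instead of through the UI form. This is only a sketch: the ``extra__google_cloud_platform__*`` keys are an assumption about how the form fields are stored in the connection's Extra JSON, so verify them against your Airflow version, and all values below are placeholders.

.. code-block:: python

    # Sketch only: a google_cloud_platform connection built in code.
    # The extra__google_cloud_platform__* key names are assumptions; verify them
    # against your Airflow version. All values below are placeholders.
    import json

    from airflow import settings
    from airflow.models import Connection

    gcp_conn = Connection(
        conn_id="google_cloud_default",
        conn_type="google_cloud_platform",
        extra=json.dumps({
            "extra__google_cloud_platform__project": "my-gcp-project",          # Project Id
            "extra__google_cloud_platform__key_path": "/path/to/keyfile.json",  # Keyfile Path
            "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform",  # Scopes
        }),
    )

    session = settings.Session()
    session.add(gcp_conn)
    session.commit()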