
Commit ede6729

gerardo authored and Fokko committed
[AIRFLOW-2499] Dockerise CI pipeline (#3393)
Airflow tests depend on many external services and other custom setup, which makes it hard for contributors to work on this codebase. CI builds have also been unreliable, and the causes are hard to reproduce. Having every contributor emulate the build environment by hand makes it easy to end up in an "it works on my machine" sort of situation.

This implements a dockerised version of the current build pipeline. This setup has a few advantages:

* Travis CI tests are reproducible locally
* The same build setup can be used to create a local development environment
1 parent b7f63c5 · commit ede6729

38 files changed: +604 -706 lines
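
The headline of this change is that the CI run is now a single Docker Compose invocation, so it can be reproduced outside Travis. Below is a minimal sketch of doing that from a local checkout, assuming Docker and a recent Docker Compose are installed; whether `run-ci.sh` reads `TOX_ENV` from the host environment (rather than only from the Compose file) is an assumption based on the matrix entries in the `.travis.yml` diff below:

```bash
# From the root of an incubator-airflow checkout.
# Choose one entry from the Travis matrix below, e.g. flake8 or
# py27-backend_sqlite (assumed to be picked up by run-ci.sh).
export TOX_ENV=py27-backend_sqlite

# The exact command Travis now runs in its `script` phase:
docker-compose --log-level ERROR \
    -f scripts/ci/docker-compose.yml \
    run airflow-testing /app/scripts/ci/run-ci.sh
```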

.gitignore (+2)

```diff
@@ -144,3 +144,5 @@ scripts/ci/kubernetes/kube/.generated/airflow.yaml
 node_modules
 npm-debug.log*
 static/dist
+derby.log
+metastore_db
```

.travis.yml (+17 -71)

```diff
@@ -19,94 +19,40 @@
 sudo: true
 dist: trusty
 language: python
-jdk:
-  - oraclejdk8
-services:
-  - cassandra
-  - mongodb
-  - mysql
-  - postgresql
-  - rabbitmq
-addons:
-  apt:
-    packages:
-      - slapd
-      - ldap-utils
-      - openssh-server
-      - mysql-server-5.6
-      - mysql-client-core-5.6
-      - mysql-client-5.6
-      - krb5-user
-      - krb5-kdc
-      - krb5-admin-server
-      - oracle-java8-installer
-postgresql: "9.2"
-python:
-  - "2.7"
-  - "3.5"
 env:
   global:
+    - DOCKER_COMPOSE_VERSION=1.20.0
     - SLUGIFY_USES_TEXT_UNIDECODE=yes
     - TRAVIS_CACHE=$HOME/.travis_cache/
-    - KRB5_CONFIG=/etc/krb5.conf
-    - KRB5_KTNAME=/etc/airflow.keytab
-    # Travis on google cloud engine has a global /etc/boto.cfg that
-    # does not work with python 3
-    - BOTO_CONFIG=/tmp/bogusvalue
   matrix:
+    - TOX_ENV=flake8
     - TOX_ENV=py27-backend_mysql
     - TOX_ENV=py27-backend_sqlite
     - TOX_ENV=py27-backend_postgres
-    - TOX_ENV=py35-backend_mysql
-    - TOX_ENV=py35-backend_sqlite
-    - TOX_ENV=py35-backend_postgres
-    - TOX_ENV=flake8
+    - TOX_ENV=py35-backend_mysql PYTHON_VERSION=3
+    - TOX_ENV=py35-backend_sqlite PYTHON_VERSION=3
+    - TOX_ENV=py35-backend_postgres PYTHON_VERSION=3
     - TOX_ENV=py27-backend_postgres KUBERNETES_VERSION=v1.9.0
-    - TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0
-matrix:
-  exclude:
-    - python: "3.5"
-      env: TOX_ENV=py27-backend_mysql
-    - python: "3.5"
-      env: TOX_ENV=py27-backend_sqlite
-    - python: "3.5"
-      env: TOX_ENV=py27-backend_postgres
-    - python: "2.7"
-      env: TOX_ENV=py35-backend_mysql
-    - python: "2.7"
-      env: TOX_ENV=py35-backend_sqlite
-    - python: "2.7"
-      env: TOX_ENV=py35-backend_postgres
-    - python: "2.7"
-      env: TOX_ENV=flake8
-    - python: "3.5"
-      env: TOX_ENV=py27-backend_postgres KUBERNETES_VERSION=v1.9.0
-    - python: "2.7"
-      env: TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0
+    - TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0 PYTHON_VERSION=3
 cache:
   directories:
     - $HOME/.wheelhouse/
+    - $HOME/.cache/pip
     - $HOME/.travis_cache/
 before_install:
-  - yes | ssh-keygen -t rsa -C your_email@youremail.com -P '' -f ~/.ssh/id_rsa
-  - cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
-  - ln -s ~/.ssh/authorized_keys ~/.ssh/authorized_keys2
-  - chmod 600 ~/.ssh/*
-  - jdk_switcher use oraclejdk8
+  - sudo ls -lh $HOME/.cache/pip/
+  - sudo rm -rf $HOME/.cache/pip/* $HOME/.wheelhouse/*
+  - sudo chown -R travis.travis $HOME/.cache/pip
 install:
+  # Use recent docker-compose version
+  - sudo rm /usr/local/bin/docker-compose
+  - curl -L https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-`uname -s`-`uname -m` > docker-compose
+  - chmod +x docker-compose
+  - sudo mv docker-compose /usr/local/bin
   - pip install --upgrade pip
-  - pip install tox
   - pip install codecov
-before_script:
-  - cat "$TRAVIS_BUILD_DIR/scripts/ci/my.cnf" | sudo tee -a /etc/mysql/my.cnf
-  - mysql -e 'drop database if exists airflow; create database airflow' -u root
-  - sudo service mysql restart
-  - psql -c 'create database airflow;' -U postgres
-  - export PATH=${PATH}:/tmp/hive/bin
-  # Required for K8s v1.10.x. See
-  # https://github.com/kubernetes/kubernetes/issues/61058#issuecomment-372764783
-  - sudo mount --make-shared / && sudo service docker restart
 script:
-  - ./scripts/ci/travis_script.sh
+  - docker-compose --log-level ERROR -f scripts/ci/docker-compose.yml run airflow-testing /app/scripts/ci/run-ci.sh
 after_success:
+  - sudo chown -R travis.travis .
   - codecov
```
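
The matrix above now varies only environment variables (`TOX_ENV`, `PYTHON_VERSION`, `KUBERNETES_VERSION`), which is what makes local reproduction possible. As a sketch of debugging a single matrix entry interactively, you can drop into the same container with a shell instead of the CI script; the `bash` entry point and the `tox -e` invocation are taken from the CONTRIBUTING.md changes below, and treating any other matrix entry the same way is an assumption:

```bash
# Open a shell in the CI container defined in scripts/ci/docker-compose.yml
docker-compose -f scripts/ci/docker-compose.yml run airflow-testing bash

# Inside the container: run one matrix entry by hand through tox,
# e.g. the py27/sqlite combination from the matrix above
tox -e py27-backend_sqlite
```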

CONTRIBUTING.md (+100 -107)

````diff
@@ -3,22 +3,20 @@
 Contributions are welcome and are greatly appreciated! Every
 little bit helps, and credit will always be given.
 
-## Table of Contents
-
-- [TOC](#table-of-contents)
-- [Types of Contributions](#types-of-contributions)
-  - [Report Bugs](#report-bugs)
-  - [Fix Bugs](#fix-bugs)
-  - [Implement Features](#implement-features)
-  - [Improve Documentation](#improve-documentation)
-  - [Submit Feedback](#submit-feedback)
-- [Documentation](#documentation)
-- [Development and Testing](#development-and-testing)
-  - [Setting up a development environment](#setting-up-a-development-environment)
-  - [Pull requests guidelines](#pull-request-guidelines)
-  - [Testing on Travis CI](#testing-on-travis-ci)
-  - [Testing Locally](#testing-locally)
-- [Changing the Metadata Database](#changing-the-metadata-database)
+# Table of Contents
+* [TOC](#table-of-contents)
+* [Types of Contributions](#types-of-contributions)
+  - [Report Bugs](#report-bugs)
+  - [Fix Bugs](#fix-bugs)
+  - [Implement Features](#implement-features)
+  - [Improve Documentation](#improve-documentation)
+  - [Submit Feedback](#submit-feedback)
+* [Documentation](#documentation)
+* [Development and Testing](#development-and-testing)
+  - [Setting up a development environment](#setting-up-a-development-environment)
+  - [Running unit tests](#running-unit-tests)
+* [Pull requests guidelines](#pull-request-guidelines)
+* [Changing the Metadata Database](#changing-the-metadata-database)
 
 ## Types of Contributions
 
@@ -83,57 +81,110 @@ extras to build the full API reference.
 
 ## Development and Testing
 
-### Set up a development env using Docker
-
-Go to your Airflow directory and start a new docker container. You can choose between Python 2 or 3, whatever you prefer.
-
-```
-# Start docker in your Airflow directory
-docker run -t -i -v `pwd`:/airflow/ -w /airflow/ -e SLUGIFY_USES_TEXT_UNIDECODE=yes python:2 bash
-
-# Install Airflow with all the required dependencies,
-# including the devel which will provide the development tools
-pip install -e .[devel,druid,hdfs,hive]
-
-# Init the database
-airflow initdb
-
-nosetests -v tests/hooks/test_druid_hook.py
-
-test_get_first_record (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
-test_get_records (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
-test_get_uri (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
-test_get_conn_url (tests.hooks.test_druid_hook.TestDruidHook) ... ok
-test_submit_gone_wrong (tests.hooks.test_druid_hook.TestDruidHook) ... ok
-test_submit_ok (tests.hooks.test_druid_hook.TestDruidHook) ... ok
-test_submit_timeout (tests.hooks.test_druid_hook.TestDruidHook) ... ok
-test_submit_unknown_response (tests.hooks.test_druid_hook.TestDruidHook) ... ok
-
-----------------------------------------------------------------------
-Ran 8 tests in 3.036s
-
-OK
-```
-
-The Airflow code is mounted inside of the Docker container, so if you change something using your favorite IDE, you can directly test is in the container.
-
-### Set up a development env using Virtualenv
-
-Please install python(2.7.x or 3.4.x), mysql, and libxml by using system-level package
-managers like yum, apt-get for Linux, or homebrew for Mac OS at first.
-It is usually best to work in a virtualenv and tox. Install development requirements:
-
-```
-cd $AIRFLOW_HOME
-virtualenv env
-source env/bin/activate
-pip install -e .[devel]
-tox
-```
+### Set up a development environment
+
+There are three ways to set up an Apache Airflow development environment.
+
+1. Using tools and libraries installed directly on your system.
+
+   Install Python (2.7.x or 3.4.x), MySQL, and libxml by using system-level package
+   managers like yum, apt-get for Linux, or Homebrew for Mac OS at first. Refer to the
+   [base CI Dockerfile](https://github.com/apache/incubator-airflow-ci/blob/master/Dockerfile.base)
+   for a comprehensive list of required packages.
+
+   Then install the python development requirements. It is usually best to work in a virtualenv:
+
+   ```bash
+   cd $AIRFLOW_HOME
+   virtualenv env
+   source env/bin/activate
+   pip install -e .[devel]
+   ```
+
+2. Using a Docker container
+
+   Go to your Airflow directory and start a new docker container. You can choose between Python 2 or 3, whatever you prefer.
+
+   ```
+   # Start docker in your Airflow directory
+   docker run -t -i -v `pwd`:/airflow/ -w /airflow/ -e SLUGIFY_USES_TEXT_UNIDECODE=yes python:2 bash
+
+   # Go to the Airflow directory
+   cd /airflow/
+
+   # Install Airflow with all the required dependencies,
+   # including the devel extra, which provides the development tools
+   pip install -e ".[hdfs,hive,druid,devel]"
+
+   # Init the database
+   airflow initdb
+
+   nosetests -v tests/hooks/test_druid_hook.py
+
+   test_get_first_record (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
+   test_get_records (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
+   test_get_uri (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
+   test_get_conn_url (tests.hooks.test_druid_hook.TestDruidHook) ... ok
+   test_submit_gone_wrong (tests.hooks.test_druid_hook.TestDruidHook) ... ok
+   test_submit_ok (tests.hooks.test_druid_hook.TestDruidHook) ... ok
+   test_submit_timeout (tests.hooks.test_druid_hook.TestDruidHook) ... ok
+   test_submit_unknown_response (tests.hooks.test_druid_hook.TestDruidHook) ... ok
+
+   ----------------------------------------------------------------------
+   Ran 8 tests in 3.036s
+
+   OK
+   ```
+
+   The Airflow code is mounted inside of the Docker container, so if you change something using your favorite IDE, you can directly test it in the container.
+
+3. Using [Docker Compose](https://docs.docker.com/compose/) and Airflow's CI scripts.
+
+   Start a docker container through Compose for development to avoid installing the packages directly on your system. The following will give you a shell inside a container, run all required service containers (MySQL, PostgreSQL, krb5 and so on) and install all the dependencies:
+
+   ```bash
+   docker-compose -f scripts/ci/docker-compose.yml run airflow-testing bash
+   # From the container
+   pip install -e .[devel]
+   # Run all the tests with python and mysql through tox
+   tox -e py35-backend_mysql
+   ```
+
+### Running unit tests
+
+To run tests locally, once your unit test environment is set up (directly on your
+system or through our Docker setup) you should be able to simply run
+``./run_unit_tests.sh`` at will.
+
+For example, in order to just execute the "core" unit tests, run the following:
+
+```
+./run_unit_tests.sh tests.core:CoreTest -s --logging-level=DEBUG
+```
+
+or a single test method:
+
+```
+./run_unit_tests.sh tests.core:CoreTest.test_check_operators -s --logging-level=DEBUG
+```
+
+To run the whole test suite with Docker Compose, do:
+
+```
+# Install Docker Compose first, then this will run the tests
+docker-compose -f scripts/ci/docker-compose.yml run airflow-testing /app/scripts/ci/run-ci.sh
+```
+
+Alternatively, you can also set up [Travis CI](https://travis-ci.org/) on your repo to automate this.
+It is free for open source projects.
+
+For more information on how to run a subset of the tests, take a look at the
+nosetests docs.
+
+See also the list of test classes and methods in `tests/core.py`.
 
 Feel free to customize based on the extras available in [setup.py](./setup.py)
 
-### Pull Request Guidelines
+## Pull Request Guidelines
 
 Before you submit a pull request from your forked repo, check that it
 meets these guidelines:
@@ -213,64 +264,6 @@ More information:
 [travis-ci-open-source]: https://docs.travis-ci.com/user/open-source-on-travis-ci-com/
 [travis-ci-org-vs-com]: https://devops.stackexchange.com/a/4305/8830
 
-### Testing locally
-
-#### TL;DR
-
-Tests can then be run with (see also the [Running unit tests](#running-unit-tests) section below):
-
-```
-./run_unit_tests.sh
-```
-
-Individual test files can be run with:
-
-```
-nosetests [path to file]
-```
-
-#### Running unit tests
-
-We *highly* recommend setting up [Travis CI](https://travis-ci.org/) on
-your repo to automate this. It is free for open source projects. If for
-some reason you cannot, you can use the steps below to run tests.
-
-Here are loose guidelines on how to get your environment to run the unit tests.
-We do understand that no one out there can run the full test suite since
-Airflow is meant to connect to virtually any external system and that you most
-likely have only a subset of these in your environment. You should run the
-CoreTests and tests related to things you touched in your PR.
-
-To set up a unit test environment, first take a look at `run_unit_tests.sh` and
-understand that your ``AIRFLOW_CONFIG`` points to an alternate config file
-while running the tests. You shouldn't have to alter this config file but
-you may if need be.
-
-From that point, you can actually export these same environment variables in
-your shell, start an Airflow webserver ``airflow webserver -d`` and go and
-configure your connection. Default connections that are used in the tests
-should already have been created, you just need to point them to the systems
-where you want your tests to run.
-
-Once your unit test environment is setup, you should be able to simply run
-``./run_unit_tests.sh`` at will.
-
-For example, in order to just execute the "core" unit tests, run the following:
-
-```
-./run_unit_tests.sh tests.core:CoreTest -s --logging-level=DEBUG
-```
-
-or a single test method:
-
-```
-./run_unit_tests.sh tests.core:CoreTest.test_check_operators -s --logging-level=DEBUG
-```
-
-For more information on how to run a subset of the tests, take a look at the
-nosetests docs.
-
-See also the list of test classes and methods in `tests/core.py`.
 
 ### Changing the Metadata Database
 
````
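The new "Running unit tests" section above is deliberately backend-agnostic: `./run_unit_tests.sh` is meant to work both on a locally provisioned machine and inside the Compose container. Here is a short sketch combining the two workflows; it assumes the container puts you in (or lets you `cd` to) the mounted source tree where `run_unit_tests.sh` lives:

```bash
# Shell into the CI container, as described in CONTRIBUTING.md above
docker-compose -f scripts/ci/docker-compose.yml run airflow-testing bash

# Inside the container: install the devel extras, then run only the
# "core" unit tests with debug logging (flags from the docs above)
pip install -e .[devel]
./run_unit_tests.sh tests.core:CoreTest -s --logging-level=DEBUG
```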
README.md (+6)

```diff
@@ -78,6 +78,12 @@ unit of work and continuity.
 
 ![](/docs/img/code.png)
 
+
+## Contributing
+
+Want to help build Apache Airflow? Check out our [contributing documentation](https://github.com/apache/incubator-airflow/blob/master/CONTRIBUTING.md).
+
+
 ## Who uses Airflow?
 
 As the Airflow community grows, we'd like to keep track of who is using
```

airflow/operators/python_operator.py (-2)

```diff
@@ -188,10 +188,8 @@ class PythonVirtualenvOperator(PythonOperator):
     variable named virtualenv_string_args will be available (populated by
     string_args). In addition, one can pass stuff through op_args and op_kwargs, and one
     can use a return value.
-
     Note that if your virtualenv runs in a different Python major version than Airflow,
     you cannot use return values, op_args, or op_kwargs. You can use string_args though.
-
     :param python_callable: A python function with no references to outside variables,
         defined with def, which will be run in a virtualenv
     :type python_callable: function
```
