Trac #24655: Automatically build docker images with CircleCI/GitLab CI
It would be nice to update our docker images automatically through
continuous integration services. Of course it's good to have these
images up-to-date without manual intervention but this is also
convenient as a starting point for people who want to use CI for their
own branches of Sage (besides the patchbot.)¹

This ticket proposes recipes for GitLab CI and CircleCI to build our
docker images automatically. On the respective websites, the CI can be
configured to push automatically to the Docker Hub. A webhook (on
github) updates the README on Docker Hub automatically.

I implemented this for both GitLab CI and CircleCI. I think GitLab CI is
more relevant in the long run; it is also open source and people can
provision their own machines as runners. CircleCI, on the other hand, works
out of the box for Sage without private test runners, and it allows
for easier debugging as you can log on to the machine running your tests
with SSH. I tried to share most code between the two implementations.

See also sagemath/docker-images#13 and
sagemath/sage-binder-env#3 for a followup
(automatically provide jupyter notebooks for easier review.)

----

Here are some numbers and screenshots (click on the screenshots to go to
the real pages):

=== GitLab CI

If I provision my own runner from Google Cloud with two threads, it
takes about 5 hours to build Sage from scratch, run rudimentary tests on
the docker images, and upload them to Docker Hub and GitLab's registry.

[[Image(gitlab.png, 640, center,
link=https://gitlab.com/saraedum/sage/pipelines)]]

Recycling the build artifacts from the last run on the develop branch
brings this down to about **??** minutes (on GitLab's free shared
runners with two threads.) This roughly breaks down as:
* **32** minutes for `build-from-latest`:
  * **10** minutes for the actual build (most of which is spent in the
docbuild; caused by a known Sphinx bug to some extent)
  * **??** minutes are spent pulling the sagemath-dev image from Docker
Hub (this usually goes away if you provision your own runners and expose
the host's docker daemon by setting `DOCKER_HOST`.)
  * a few minutes running through all the fast stages of the Dockerfile.
  * a few minutes to push the resulting images to GitLab's registry.
(using GitLab's `cache`, this could probably be improved, at least for
runners that we provision ourselves.)
* **5** - **15** minutes for each test (run in parallel); the relevant
test is `test-dev.sh`, which spends 6 minutes in the actual docbuild
(just as in `build-from-latest`) and some 5 minutes pulling the
sagemath-dev image from the GitLab registry. (That part should go away
with a provisioned runner that exposes the host's docker daemon.)
* **??** minutes for the publishing to Docker Hub, roughly half of which
is spent pulling the images from the GitLab registry and the other half
pushing them to Docker Hub. (Again, exposing the host's docker
daemon would probably cut that time in half.)
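
The `DOCKER_HOST` setup mentioned in the list above could look roughly like the following sketch. The helper name `select_docker_host`, the socket path `/var/run/docker.sock`, and the fallback address `tcp://docker:2375` are assumptions made for illustration; the ticket does not say how a runner exposes the host's daemon.

```shell
#!/bin/sh
# Hypothetical helper: prefer the host's docker daemon when its socket is
# visible inside the CI job, otherwise fall back to a docker-in-docker
# service. Both the socket path and the tcp:// address are assumptions.
select_docker_host() {
    socket="${1:-/var/run/docker.sock}"
    if [ -S "$socket" ]; then
        # The host's daemon already holds the images, so nothing is re-pulled.
        echo "unix://$socket"
    else
        echo "tcp://docker:2375"
    fi
}

DOCKER_HOST="$(select_docker_host)"
export DOCKER_HOST
echo "Using docker daemon at $DOCKER_HOST"
```

With the host's daemon in use, images built in one stage remain in the local image store for later stages, which is why the pull times listed above would mostly disappear.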

With some tricks we could probably bring this down to 25 minutes (see
CircleCI below), but we won't get there without giving up on
splitting the CI into different stages (as is necessary, for technical
reasons, on CircleCI.) To go well below that, we would need to pull
binary packages from somewhere; I don't see a sustainable way of doing
this with the current SPKG system.

[[Image(gitlab-rebuild.png, 640, center,
link=https://gitlab.com/saraedum/sage/pipelines/18026318)]]

=== CircleCI

It typically takes almost **5** hours to build Sage from scratch on
CircleCI, run rudimentary tests on the docker images, and upload them to
Docker Hub.

[[Image(circleci.png, 640, center,
link=https://circleci.com/gh/saraedum/workflows/sage)]]

Recycling the build artifacts from the last run on the develop branch
brings this down to about **30** minutes usually. 5 minutes could be
saved by not testing the `sagemath-dev` and probably another minute or
two if we do not build it at all. To go significantly below 15 minutes
is probably hard with the huge sage-the-distribution (7GB
uncompressed/2GB compressed) that we have to pull every time at the
moment.

[[Image(circleci-rebuild.png, 640, center,
link=https://circleci.com/gh/saraedum/workflows/sage)]]

=== Docker Hub

A push to github updates the README on the Docker Hub page. The current
sizes are
[[Image(https://img.shields.io/microbadger/image-size/sagemath/sagemath/latest.svg)]]
and
[[Image(https://img.shields.io/microbadger/image-size/sagemath/sagemath-dev/latest.svg)]];
unfortunately MicroBadger is somewhat unstable so these numbers are
sometimes incorrectly reported as 0.

[[Image(dockerhub.png, 640, center,
link=https://hub.docker.com/r/sagemath/sagemath)]]

----

Here are some things that we need to test before merging this:

* [x] build-from-clean works in the sagemath namespace, building a tag
on GitLab, https://gitlab.com/saraedum/sage/pipelines/25831229
* [x] build-from-clean works in the sagemath namespace, building from
develop on GitLab, https://gitlab.com/saraedum/sage/pipelines/25831675
* [x] build-from-clean works in a user namespace on CircleCI,
https://circleci.com/workflow-run/4ae6af8c-2212-4724-a865-a401be4bd8b7;
this does not work reliably as it often times out after 5 hours. If we
can manage to use more packages from the system, then we should be able
to move this below CircleCI's time limit.
* [x] build-from-latest works and is fast in a user namespace on GitLab,
https://gitlab.com/saraedum/sage/pipelines/25894653
* [x] build-from-latest works and is fast in a user namespace on
CircleCI,
https://circleci.com/workflow-run/5bad5fe0-f817-4174-b0b4-de7d1be3b01c

----

After this ticket has been merged, the following steps are necessary:

* ~~Setup an account for sagemath on Circle CI.~~
* Add Docker Hub credentials on ~~Circle CI or~~ GitLab.

To see a demo of what the result looks like, go to
https://hub.docker.com/r/sagemath/sagemath/. The CircleCI runs can be
seen here https://circleci.com/gh/saraedum/sage, and the GitLab CI runs
are here https://gitlab.com/saraedum/sage/pipelines.

----

¹: I want to run unit tests of an external Sage package,
https://github.com/swewers/MCLF. Being able to build a custom docker
image which contains some not-yet-merged tickets makes this much easier.

PS: Long-term one could imagine this to be the first step to replace the
patchbot with a solution that we do not have to maintain so much
ourselves, such as gitlab-runners. This is of course outside of the
scope of this ticket but having a bunch of working CI files in our
repository might inspire people to script some other tasks in a
reproducible and standardized way.

URL: https://trac.sagemath.org/24655
Reported by: saraedum
Ticket author(s): Julian Rüth
Reviewer(s): Erik Bray
Release Manager authored and vbraun committed Aug 25, 2018
2 parents f60348f + ac6201a commit c66273f
Showing 32 changed files with 1,242 additions and 65 deletions.
9 changes: 9 additions & 0 deletions .ci/README.md
@@ -0,0 +1,9 @@
# Continuous Integration (CI)

We support several implementations of CI. All these implementations rely on
[docker](https://docker.com) in some way. This directory contains bits which
are shared between these CI implementations. The relevant docker files can be
found in `/docker/`.

* [CircleCI](https://circleci.com) is configured in `/.circleci/`.
* [GitLab CI](https://gitlab.com) is configured in `/.gitlab-ci.yml`.
50 changes: 50 additions & 0 deletions .ci/build-docker.sh
@@ -0,0 +1,50 @@
#!/bin/sh

# This script gets called from CI to build several flavours of docker images
# which contain Sage.

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set -ex

# We speed up the build process by copying built artifacts from ARTIFACT_BASE
# during docker build. See /docker/Dockerfile for more details.
ARTIFACT_BASE=${ARTIFACT_BASE:-sagemath/sagemath-dev:develop}

# Seed our cache with $ARTIFACT_BASE if it exists.
docker pull "$ARTIFACT_BASE" > /dev/null || true

docker_build() {
# Docker's --cache-from does not really work with multi-stage builds: https://github.com/moby/moby/issues/34715
# So we just have to rely on the local cache.
# All arguments are quoted so that values containing spaces (such as
# MAKEOPTS="-j 3 -l 1.8") are passed through as a single build argument.
time docker build -f docker/Dockerfile \
--build-arg "MAKEOPTS=${MAKEOPTS}" --build-arg "SAGE_NUM_THREADS=${SAGE_NUM_THREADS}" --build-arg "MAKEOPTS_DOCBUILD=${MAKEOPTS_DOCBUILD}" --build-arg "SAGE_NUM_THREADS_DOCBUILD=${SAGE_NUM_THREADS_DOCBUILD}" --build-arg "ARTIFACT_BASE=$ARTIFACT_BASE" "$@"
}

# We use a multi-stage build /docker/Dockerfile. For the caching to be
# effective, we populate the cache by building the run/build-time-dependencies
# and the make-all target. (Just building the last target is not enough as
# intermediate targets could be discarded from the cache [depending on the
# docker version] and therefore the caching would fail for our actual builds
# below.)
docker_build --target run-time-dependencies --tag run-time-dependencies:$DOCKER_TAG .
docker_build --target build-time-dependencies --tag build-time-dependencies:$DOCKER_TAG .
docker_build --target make-all --tag make-all:$DOCKER_TAG .

# Build the release image without build artifacts.
docker_build --target sagemath --tag "$DOCKER_IMAGE_CLI" .
# Display the layers of this image
docker history "$DOCKER_IMAGE_CLI"
# Build the developer image with the build artifacts intact.
# Note: It's important to build the dev image last because it might be tagged as ARTIFACT_BASE.
docker_build --target sagemath-dev --tag "$DOCKER_IMAGE_DEV" .
# Display the layers of this image
docker history "$DOCKER_IMAGE_DEV"
23 changes: 23 additions & 0 deletions .ci/describe-system.sh
@@ -0,0 +1,23 @@
#!/bin/sh

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set +e -x

docker info
docker run docker sh -c "
set -x
uname -a
df -h
cat /proc/cpuinfo
cat /proc/meminfo
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio"
45 changes: 45 additions & 0 deletions .ci/head-tail.sh
@@ -0,0 +1,45 @@
#!/bin/sh

OFFSET=1024
# This script reads from stdin and prints to stdout as long as the output
# does not exceed a certain number of bytes. When reading an EOF it prints the
# last $OFFSET lines if they have not been printed normally already.
# This script expects one argument, the number of bytes.

# Heavily inspired by a similar strategy in make, https://stackoverflow.com/a/44849696/812379.

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

stdbuf -i0 -o0 -e0 awk -v limit=$1 -v firstMissingNR=-1 -v offset=$OFFSET -v bytes=0 \
'{
if (bytes < limit) {
# this probably gets multi-byte characters wrong, but that probably does
# not matter for our purposes. (We add 1 for a UNIX newline.)
bytes += length($0) + 1;
print;
} else {
if (firstMissingNR == -1){
print "[…output truncated…]";
firstMissingNR = NR;
}
a[NR] = $0;
delete a[NR-offset];
printf "." > "/dev/stderr"
}
}
END {
if (firstMissingNR != -1) {
print "" > "/dev/stderr";
# Only a[NR-offset+1..NR] are still stored at this point; start there.
for(i = NR-offset+1 > firstMissingNR ? NR-offset+1 : firstMissingNR; i<=NR ; i++){ print a[i]; }
}
}
'
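
The truncation strategy of this script can be tried out in isolation with a small limit. The following is a simplified sketch, not the script itself: the function name `head_tail`, the ASCII marker text, and the tiny limits are chosen for illustration, and the progress dots on stderr are left out.

```shell
#!/bin/sh
# Simplified restatement of the head/tail idea: print lines until `limit`
# bytes have been emitted, then keep only the last `offset` lines in memory
# and print them once the input ends.
head_tail() {
    awk -v limit="$1" -v offset="$2" -v bytes=0 -v first=0 '
    {
        if (bytes < limit) {
            bytes += length($0) + 1   # +1 for the newline
            print
        } else {
            if (first == 0) {
                print "[...output truncated...]"
                first = NR
            }
            tail[NR] = $0
            delete tail[NR - offset]  # keep at most `offset` lines
        }
    }
    END {
        if (first != 0) {
            start = (NR - offset + 1 > first) ? NR - offset + 1 : first
            for (i = start; i <= NR; i++) print tail[i]
        }
    }'
}

# Prints 1 to 5, the truncation marker, then 98, 99 and 100.
seq 1 100 | head_tail 10 3
```

In CI, a pattern like this keeps a long build log below the provider's log size limit while still showing the end of the log, which is usually where the error is.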

40 changes: 40 additions & 0 deletions .ci/protect-secrets.sh
@@ -0,0 +1,40 @@
#!/bin/sh

# This script protects all environment variables that start with "SECRET_".
# It puts them in a temporary file. The name of the variable contains the path
# of that file. This filename can then safely be used in `cat` even if `set
# -x` has been turned on. Also you can run "export" to understand the
# environment without danger.
# Be careful, however, not to use this like the following:
# docker login $DOCKER_USER $(cat $SECRET_DOCKER_PASS)
# as this would expose the password if `set -x` has been turned on.

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set -eo pipefail
set +x

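# Write the value of the variable named $1 to a temporary file and print the
# path of that file. (Despite the name, nothing is encrypted.)
encrypt() {
RET=`mktemp`
eval " echo \$$1" > "$RET"
echo "$RET"
}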

for name in `awk 'END { for (name in ENVIRON) { print name; } }' < /dev/null`; do
case "$name" in
SECRET_*)
export $name="$(encrypt $name)"
echo "Protected $name"
;;
esac
done

unset encrypt
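
The effect of this script can be illustrated without any CI setup. The following is a minimal sketch under stated assumptions: the helper name `protect_value` and the sample secret are hypothetical, and the helper takes the value directly rather than a variable name as `encrypt` above does.

```shell
#!/bin/sh
# Hypothetical helper: store a value in a temporary file and print the path.
protect_value() {
    file="$(mktemp)"
    printf '%s\n' "$1" > "$file"
    echo "$file"
}

# After protection the variable holds a path, not the secret itself, so
# `export` or `set -x` traces only ever show the path.
SECRET_DEMO_PASS="$(protect_value 'not-a-real-password')"
echo "SECRET_DEMO_PASS now points to $SECRET_DEMO_PASS"

# Safe use: the value travels through a pipe and never appears on a command
# line. Here `wc -c` stands in for something like `docker login --password-stdin`.
cat "$SECRET_DEMO_PASS" | wc -c
```

The unsafe variant quoted in the header comment differs in that `$(cat ...)` expands the secret into the command's arguments, where `set -x` would print it.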
30 changes: 30 additions & 0 deletions .ci/pull-gitlab.sh
@@ -0,0 +1,30 @@
#!/bin/sh

# This script gets called from CI to pull the Sage docker images that were
# built during the "build" phase into the connected docker daemon
# (likely a docker-in-docker.)
# This script expects a single parameter, the base name of the docker image
# such as sagemath or sagemath-dev.
# The variable $DOCKER_IMAGE is set to the full name of the pulled image;
# source this script to use it.

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set -ex

# Pull the built images from the gitlab registry and give them the original
# names they had after the build.
# Note that "set -x" prints the $CI_BUILD_TOKEN here but GitLab removes it
# automatically from the log output.
docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
docker pull $CI_REGISTRY_IMAGE/$1:$DOCKER_TAG
export DOCKER_IMAGE="${DOCKER_NAMESPACE:-sagemath}/$1:$DOCKER_TAG"
docker tag $CI_REGISTRY_IMAGE/$1:$DOCKER_TAG $DOCKER_IMAGE
29 changes: 29 additions & 0 deletions .ci/push-dockerhub.sh
@@ -0,0 +1,29 @@
#!/bin/sh

# This script gets called from CI to push our docker images to
# $DOCKER_NAMESPACE/sagemath* on the Docker Hub.
# This script expects a single parameter, the base name of the docker image
# such as sagemath or sagemath-dev.

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set -ex

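# Note: an "exit" inside ( ... ) would only leave the subshell, so we use a
# plain if statement to actually stop the script here.
if [ -z "$DOCKER_TAG" ]; then
echo "Can not push untagged build."
exit 0
fi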

# Push the built images to the docker hub (and fail silently if
# DOCKER_USER/SECRET_DOCKER_PASS have not been configured.)
if [ -z "$DOCKER_USER" -o -z "$SECRET_DOCKER_PASS" ]; then
echo "DOCKER_USER/SECRET_DOCKER_PASS variables have not been configured in your Continuous Integration setup. Not pushing built images to Docker Hub."
else
cat "$SECRET_DOCKER_PASS" | docker login -u $DOCKER_USER --password-stdin
docker push ${DOCKER_NAMESPACE:-sagemath}/$1:$DOCKER_TAG
fi
25 changes: 25 additions & 0 deletions .ci/push-gitlab.sh
@@ -0,0 +1,25 @@
#!/bin/sh

# This script gets called from CI to push our docker images to the registry
# configured in GitLab. (Mostly, so we can pull them again to push them to the
# Docker Hub.)
# This script expects a single parameter, the base name of the docker image
# such as sagemath or sagemath-dev.

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set -ex

# Note that "set -x" prints the $CI_BUILD_TOKEN here but GitLab removes it
# automatically from the log output.
docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
docker tag ${DOCKER_NAMESPACE:-sagemath}/$1:$DOCKER_TAG $CI_REGISTRY_IMAGE/$1:$DOCKER_TAG
docker push $CI_REGISTRY_IMAGE/$1:$DOCKER_TAG
65 changes: 65 additions & 0 deletions .ci/setup-make-parallelity.sh
@@ -0,0 +1,65 @@
#!/bin/sh

# Source this to set CPUTHREADS (the number of apparent cores) and RAMTHREADS
# (total RAM divided by the maximum amount typically needed per thread.)
# From these, this script infers reasonable defaults for SAGE_NUM_THREADS and
# MAKEOPTS.

# We do exactly the same for CPUTHREADS_DOCBUILD, RAMTHREADS_DOCBUILD,
# SAGE_NUM_THREADS_DOCBUILD, MAKEOPTS_DOCBUILD, as the docbuild needs
# substantially more RAM as of May 2018.

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set -ex

if [ -z "$CPUTHREADS" ]; then
# Determine the number of threads that can run simultaneously on this system
# (we might not have nproc available.)
# Note that this value is incorrect for some CI providers (notably CircleCI:
# https://circleci.com/docs/2.0/configuration-reference/#resource_class) which
# provision fewer vCPUs than shown in /proc/cpuinfo. So it is probably better
# to set CPUTHREADS manually in your CI configuration.
CPUTHREADS=`docker run docker cat /proc/cpuinfo | grep -E '^processor' | wc -l`
fi
if [ -z "$CPUTHREADS_DOCBUILD" ]; then
CPUTHREADS_DOCBUILD=$CPUTHREADS
fi

if [ -z "$RAMTHREADS" ]; then
RAMTHREADS=$(( `docker run docker cat /proc/meminfo | grep MemTotal | awk '{ print $2 }'` / 1048576 ))
if [ $RAMTHREADS = 0 ];then
RAMTHREADS=1;
fi
fi
if [ -z "$RAMTHREADS_DOCBUILD" ]; then
RAMTHREADS_DOCBUILD=$(( `docker run docker cat /proc/meminfo | grep MemTotal | awk '{ print $2 }'` / 2097152 ))
if [ $RAMTHREADS_DOCBUILD = 0 ];then
RAMTHREADS_DOCBUILD=1;
fi
fi

# On CI machines with their virtual CPUs, it seems to be quite beneficial to
# overcommit on CPU usage. We only need to make sure that we do not exceed RAM
# (as there is no swap.)
if [ $CPUTHREADS -lt $RAMTHREADS ]; then
export SAGE_NUM_THREADS=$((CPUTHREADS + 1))
else
export SAGE_NUM_THREADS=$RAMTHREADS
fi
if [ $CPUTHREADS_DOCBUILD -lt $RAMTHREADS_DOCBUILD ]; then
export SAGE_NUM_THREADS_DOCBUILD=$((CPUTHREADS_DOCBUILD + 1))
else
export SAGE_NUM_THREADS_DOCBUILD=$RAMTHREADS_DOCBUILD
fi
# Set -j and -l for make (though -l is probably ignored by Sage)
export MAKEOPTS="-j $SAGE_NUM_THREADS -l $((CPUTHREADS - 1)).8"
export MAKEOPTS_DOCBUILD="-j $SAGE_NUM_THREADS_DOCBUILD -l $((CPUTHREADS_DOCBUILD - 1)).8"
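
The arithmetic of this script can be checked in isolation, without docker. In the sketch below the helper name `threads_for` and the sample machine sizes are hypothetical; only the divisors (1048576 kB per build thread, 2097152 kB per docbuild thread) and the "CPUs + 1" overcommit rule are taken from the script above.

```shell
#!/bin/sh
# threads_for MEM_KB KB_PER_THREAD CPUS
# Prints the thread count the rules above would choose.
threads_for() {
    mem_kb="$1"; kb_per_thread="$2"; cpus="$3"
    ram_threads=$(( mem_kb / kb_per_thread ))
    if [ "$ram_threads" -eq 0 ]; then
        ram_threads=1               # always allow at least one thread
    fi
    if [ "$cpus" -lt "$ram_threads" ]; then
        echo $(( cpus + 1 ))        # RAM is plentiful: overcommit the CPUs by one
    else
        echo "$ram_threads"         # RAM is the limiting factor
    fi
}

# A hypothetical runner with 2 vCPUs and 8 GiB of RAM (8388608 kB):
echo "SAGE_NUM_THREADS=$(threads_for 8388608 1048576 2)"           # prints 3
echo "SAGE_NUM_THREADS_DOCBUILD=$(threads_for 8388608 2097152 2)"  # prints 3

# A hypothetical runner with 4 vCPUs and 4 GiB of RAM (4194304 kB):
echo "SAGE_NUM_THREADS=$(threads_for 4194304 1048576 4)"           # prints 4
echo "SAGE_NUM_THREADS_DOCBUILD=$(threads_for 4194304 2097152 4)"  # prints 2
```

The second machine shows why the docbuild gets its own variables: with the same hardware, the larger per-thread RAM budget halves the number of threads.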
30 changes: 30 additions & 0 deletions .ci/test-cli.sh
@@ -0,0 +1,30 @@
#!/bin/sh

# This script gets called from CI to run minimal tests on the sagemath image.

# Usage: ./test-cli.sh IMAGE-NAME

# ****************************************************************************
# Copyright (C) 2018 Julian Rüth <julian.rueth@fsfe.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
# http://www.gnu.org/licenses/
# ****************************************************************************

set -ex

echo "Checking that Sage starts and can calculate 1+1…"
# Calculate 1+1 (remove startup messages and leading & trailing whitespace)
TWO=`docker run "$1" sage -c "'print(1+1)'" | tail -1 | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'`
[ "x$TWO" = "x2" ]

echo "Checking that some binaries that should be distributed with Sage are on the PATH…"
# We could also run minimal tests on these but we don't yet.
# Check that Singular and GAP are present
docker run "$1" which Singular
docker run "$1" which gap
# Check that jupyter is present (for binder)
docker run "$1" which jupyter
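
The output clean-up used for the `1+1` check can be tried without a docker image. In this sketch the function name `trim` and the simulated output (a banner line followed by a padded result) are made up for illustration; the `tail` and `sed` steps mirror the pipeline in the script above.

```shell
#!/bin/sh
# Strip leading and trailing whitespace from every input line.
trim() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Simulated program output: a startup message, then the padded result.
TWO=`printf 'some startup banner\n   2  \n' | tail -n 1 | trim`
[ "x$TWO" = "x2" ] && echo "result is 2"
```

Taking only the last line discards any startup messages, and trimming makes the comparison independent of padding in the output.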