Scaling Gitlab runners

Gitlab documents a method for auto-scaling a gitlab-runner. But what if you are not able to apply Docker Machine concepts to your cluster, or don’t need autoscaling and are content to scale manually? Read on for my solution to simple, manual scaling of Gitlab runners in a Docker swarm.

The Gitlab runner autoscaling relies on Docker Machine, which works great if you have complete administrative access to your cluster and can bring up VMs at will. Docker Machine also has a “generic” driver that works over ssh, but it in turn relies on passwordless sudo access, which isn’t always available either.

Or perhaps you just don’t need autoscaling and want to be able to manually control the number of runners available to your Gitlab CI system.

My solution to this problem involves deriving a new Docker image from the official Gitlab runner image and overriding the entrypoint with a new script that registers the runner at startup and unregisters it at shutdown.

Here’s the new Dockerfile.

FROM gitlab/gitlab-runner:v9.0.2

COPY run.sh /run.sh

RUN set -x \
 && chmod 755 /run.sh

# Override the entrypoint from the parent image to provide registration and 
# de-registration at container start & stop.
ENTRYPOINT [ "/run.sh" ]

It’s really a very simple Dockerfile. It just copies in the new run.sh script, sets execute permissions and defines the new entrypoint pointing to the run script.

Here’s the run script.

#!/bin/bash
set -e
set -x

# Register only when no config.toml exists yet (first start of this container).
[[ -f /etc/gitlab-runner/config.toml ]] || initialize=yes

kill_runner() {
  /usr/bin/gitlab-runner \
    unregister \
      --name "${RUNNER_NAME}"
}
trap kill_runner EXIT

if [[ $initialize = yes ]]; then
  /usr/bin/gitlab-runner \
    register \
      --non-interactive \
      --registration-token "${RUNNER_CI_TOKEN}" \
      --url "${RUNNER_CI_URL}" \
      --name "${RUNNER_NAME}" \
      --executor docker \
      --run-untagged \
      --docker-privileged \
      --docker-image "${RUNNER_IMAGE}" \
      --docker-volumes /var/run/docker.sock:/var/run/docker.sock \
      --tag-list "${RUNNER_TAGS}"
fi

/usr/bin/gitlab-runner \
  run \
    --user=gitlab-runner \
    --working-directory=/home/gitlab-runner

The script checks for a config.toml, whose presence suggests that initialization has already occurred. If you don’t use volumes at all, the file will never be there at startup, so registration happens every time a container is started. If initialization is needed, the script runs the gitlab-runner registration. Properties of the runner that are relatively static, such as docker-privileged mode for docker-in-docker, are hard-coded in the register command. Properties that are not so static are read from environment variables, which can be supplied as part of the docker run command or in the compose file.
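The first-run check can be tried on its own, away from a real runner. The following is only a sketch: needs_registration is a hypothetical helper (it is not part of run.sh), and a temporary directory stands in for /etc/gitlab-runner.

```shell
# Hypothetical helper, for illustration only: succeeds (returns 0) when the
# given config file does not exist yet, i.e. the runner still needs registering.
needs_registration() {
  [ ! -f "${1:-/etc/gitlab-runner/config.toml}" ]
}

# Use a throwaway directory rather than the real /etc/gitlab-runner.
demo_dir=$(mktemp -d)

if needs_registration "$demo_dir/config.toml"; then
  first_check="register"
else
  first_check="skip"
fi

# Simulate the config file that a completed registration leaves behind.
touch "$demo_dir/config.toml"

if needs_registration "$demo_dir/config.toml"; then
  second_check="register"
else
  second_check="skip"
fi

echo "before config.toml exists: $first_check"
echo "after config.toml exists:  $second_check"
rm -rf "$demo_dir"
```

With a volume mounted over /etc/gitlab-runner the second case applies on restart; without one, every new container takes the first path.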

The script traps the shell’s EXIT pseudo-signal to run the kill_runner function, which unregisters the runner when the script exits. This means that when you use Docker scaling to add runners, the new runners are automatically registered with the Gitlab instance, and when you scale down, the removed runners are automatically unregistered from Gitlab. Nice.
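To see the EXIT trap in isolation, here is a small sketch in which a stand-in cleanup function writes to a log file instead of calling gitlab-runner unregister. The function name and the log file are assumptions made for the demonstration.

```shell
# Temporary log file used only to observe the order of events.
log_file=$(mktemp)

(
  # Stand-in for kill_runner: records that cleanup ran.
  cleanup() { echo "unregistered" >> "$log_file"; }
  trap cleanup EXIT

  # Stand-in for the long-running "gitlab-runner run" step.
  echo "running" >> "$log_file"
  # The subshell ends here, and the shell fires the EXIT trap on the way out.
)

trap_output=$(cat "$log_file")
echo "$trap_output"
rm -f "$log_file"
```

This only demonstrates the shell behaviour. Whether the trap fires when a container is stopped depends on the stop signal actually causing the script to exit, which is worth checking in your own setup.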

Here’s an example compose file for use in Docker Swarm.

version: '3.0'

services:
  swarm-gitlab-runner:
    deploy:
      restart_policy:
        condition: any
        
    environment:
      RUNNER_CI_TOKEN: <Gitlab instance runner registration token>
      RUNNER_CI_URL: <Gitlab instance CI URL>
      
      RUNNER_NAME: gitlab-runner-
      # Default image used by runners
      RUNNER_IMAGE: ubuntu:16.04
      RUNNER_TAGS: 'docker,linux'
      
    image: my-new-runner:1
    networks:
      - primary-network
    
networks:
  primary-network:
    external: true

You can see that the environment variables used by the run script are being delivered by the compose file. If we used this for Continuous Deployment, those environment variables could in turn be provided by our CD build system.
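Deploying the stack and then scaling it by hand might look like the sketch below. The stack name runners and the file name docker-compose.yml are assumptions; the service name is taken from the compose file above. Because docker stack deploy prefixes each service name with the stack name, the service to scale becomes runners_swarm-gitlab-runner. The run wrapper is a hypothetical helper that only prints the commands unless DRY_RUN=0 is set.

```shell
# Assumed names, not from the original post.
STACK_NAME=${STACK_NAME:-runners}
COMPOSE_FILE=${COMPOSE_FILE:-docker-compose.yml}
# Service name as defined in the compose file.
SERVICE_NAME=swarm-gitlab-runner

# Print the command when DRY_RUN is 1 (the default here), otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_runners() {
  run docker stack deploy --compose-file "$COMPOSE_FILE" "$STACK_NAME"
}

scale_runners() {
  # $1: desired number of runner containers
  run docker service scale "${STACK_NAME}_${SERVICE_NAME}=$1"
}

deploy_runners
scale_runners 3
```

Run it with DRY_RUN=0 on a swarm manager to execute the commands for real; calling scale_runners with a smaller number later removes runners again.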

Questions?