This post is going to be less of a full “how-to” on using Minio as a build cache for Gitlab and more of a discussion of the general process and the tools I used. How I set up Minio to be a build cache for my Gitlab runner is more involved than I care to get fully into in a blog post, but I am posting this in the hopes that it at least inspires others.

A Lot of Context

In my setup, my Gitlab runner lives in Kubernetes. As jobs are queued, the runner spawns new pods to perform the required work. Since the pods are temporary and do not have any kind of persistent storage by default, any work they do that is not pushed back to the Gitlab instance is lost the next time a similar job is run. Having a shared build cache allows you to, in some cases, reduce the total amount of time it takes to perform tasks by keeping local copies of Node modules or other heavy assets. Additionally, if you configure your cache key properly, you can pass temporary artifacts between jobs in a pipeline. In my setup, passing temporary data is what I need.

I use Gitlab for a number of my own personal projects, including maintaining this site’s codebase. While my “production” site is a Digital Ocean virtual machine, my staging site runs on Kubernetes. This means that, along with the other containers I build, I need to containerize my site’s code and push the resulting container images into a registry. This is done by authenticating against the container registry as part of the build pipeline.

Additionally, I am using the Gitlab CI Catalog, also known as CI/CD components. The Gitlab CI Catalog is most similar to GitHub Actions, in that you create reusable components that you can tie together to build a solution. I am using components that can sign into various container registries as well as build container images for different architectures. In an effort to create reusable and modular components, I split the registry authentication and the image build into separate components. For this to work, I must pass the cached credentials between the jobs in the pipeline. Using the shared build cache to pass the information along ensures that the credentials stay available for downstream jobs while keeping them out of artifacts that a regular user can access.
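
To give a sense of what consuming these components looks like, here is a rough sketch of a pipeline that pulls in a login component and a build component. The component paths and versions are hypothetical placeholders for illustration, not the actual components I publish:

include:
  # hypothetical component paths, shown for illustration only
  - component: gitlab.example.com/ci-components/registry-login/gitlab@1.0.0
  - component: gitlab.example.com/ci-components/docker-build/multiarch@1.0.0

The login component writes Docker credentials to a path covered by the cache, and the build component reads them back in a later stage of the same pipeline.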

My Solution

For my solution, I am leveraging a number of components I already have in place. This includes k3s as my Kubernetes solution, TrueNAS Scale as my storage solution, and various other pieces to tie it together, like democratic-csi to provide persistent storage for k3s.

The new components for my solution are the Minio operator, located at https://github.com/minio/operator/tree/master/helm/operator, as well as a tenant definition based on their documentation. The tenant I created is as minimal as possible, using a single server without any encryption. Large scale production environments will at least want to use on-the-wire encryption.
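
For reference, a stripped-down tenant definition along these lines is shown below. The names, storage size, and storage class are placeholders for my environment, so treat this as a sketch rather than a copy-and-paste manifest:

apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: gitlab-cache
  namespace: minio
spec:
  # single server, no TLS; fine for a lab, not for production
  requestAutoCert: false
  pools:
    - servers: 1
      volumesPerServer: 1
      volumeClaimTemplate:
        spec:
          storageClassName: truenas-iscsi   # provided by democratic-csi in my case
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi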

The configuration for my runner looks like this:

config.template.toml: |
  [[runners]]
    request_concurrency = 2
    [runners.cache]
      Type = "s3"
      [runners.cache.s3]
        ServerAddress = "minio:80"
        AccessKey = "[redacted]"
        SecretKey = "[redacted]"
        BucketName = "gitlab-cache"
        Insecure = true
      Shared = true
    [runners.kubernetes]
      image = "alpine:latest"
      privileged = true
      pull_policy = "always"
      service_account = "gitlab-runner"
    [runners.kubernetes.node_selector]
      "kubernetes.io/arch" = "amd64"
    [[runners.kubernetes.volumes.empty_dir]]
      name = "docker-certs"
      mount_path = "/certs/client"
      medium = "Memory"

From this, you can see that my Minio tenant was installed with a service named minio running on port 80. I used a port forward to access the tenant and then created my access credentials and a bucket, which I plugged into the runner configuration and deployed using the Helm chart for Gitlab Runner. If you are using Amazon S3 in AWS, you can leverage AWS IAM Roles for Service Accounts and assign the correct service account to the runners to achieve the same behavior more securely.
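
The one-time setup looked roughly like the following. The namespace and credentials are placeholders, and I did part of this through the Minio console, so consider the mc commands an equivalent sketch:

# reach the tenant service from a workstation (namespace is a placeholder)
kubectl -n minio port-forward svc/minio 9000:80

# point the MinIO client at the forwarded port and create the cache bucket
mc alias set local http://localhost:9000 <access-key> <secret-key>
mc mb local/gitlab-cache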

With this configuration in place, I am able to cache Docker authentication between jobs in a pipeline. In a future post, I will more fully detail how I am doing this in the CI Catalog, but for now, here is the YAML to define the cache key:

cache:
  - key: docker-cache-$CI_PIPELINE_ID
    paths:
      - $CI_PROJECT_DIR/.docker

By setting the cache key in this way, I ensure that Docker credentials are passed between jobs for an entire pipeline. Care needs to be taken to ensure the cached information is not included in artifacts, particularly if it is sensitive in nature.
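
Putting it together, the pattern looks something like the two jobs below. The job and stage names are simplified stand-ins for what I actually run; the key points are the shared cache key and pointing DOCKER_CONFIG at the cached directory:

stages:
  - prepare
  - build

variables:
  # keep the Docker client config inside the project dir so the cache picks it up
  DOCKER_CONFIG: $CI_PROJECT_DIR/.docker
  DOCKER_HOST: tcp://localhost:2375
  DOCKER_TLS_CERTDIR: ""

cache:
  - key: docker-cache-$CI_PIPELINE_ID
    paths:
      - $CI_PROJECT_DIR/.docker

registry login:
  stage: prepare
  image: docker
  services:
    - docker:dind
  script:
    # writes credentials to $DOCKER_CONFIG/config.json, which the cache preserves
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"

build image:
  stage: build
  image: docker
  services:
    - docker:dind
  script:
    # the cached $DOCKER_CONFIG/config.json supplies the credentials here
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"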

Under some conditions, you may find that your Docker in Docker builds hang or stall out, especially when you combine DIND based builds and Kubernetes. The fix for this isn’t always obvious because the problem doesn’t exactly announce itself. After a bit of searching, I came across a post that describes the issue in great detail, located at https://medium.com/@liejuntao001/fix-docker-in-docker-network-issue-in-kubernetes-cc18c229d9e5.

As described, the issue comes down to the MTU the DIND service uses when it starts. By default it uses 1500. Unfortunately, a lot of Kubernetes overlay networks set a smaller MTU of around 1450. Since DIND runs on the overlay network, it needs to use an MTU equal to or smaller than the overlay’s in order to work properly. If your build process happens to transfer data that doesn’t fit within the real Maximum Transmission Unit, the oversized packets are dropped and the job waits indefinitely for data that will never arrive. This is because DIND, and the app using it, thinks the MTU is 1500 when it is actually 1450.

Anyway, this isn’t about what MTU is or how it works; it’s about how to configure a Gitlab job that uses the DIND service with a smaller MTU. Thankfully it’s easy to do.

In your .gitlab-ci.yml file, where you enable the dind service, pass a command flag to the service, like this:

Build Image:
  image: docker
  services:
    - name: docker:dind
      command: ["--mtu=1000"]
  variables:
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ""
    DOCKER_HOST: tcp://localhost:2375

The example shown will work if you are using a Kubernetes based Gitlab Runner. With this added, you should find that your build stalls go away and everything works as expected.
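
If you are unsure what MTU your overlay network actually hands out, you can check from inside any pod or CI job before picking a value. The interface name eth0 is an assumption about the pod’s primary interface:

# prints the MTU of the pod's primary interface, e.g. 1450 on many overlay networks
cat /sys/class/net/eth0/mtu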

Arm processors, used in Raspberry Pis and maybe even in a future Mac, are gaining in popularity due to their reduced cost and improved power efficiency over more traditional x86 offerings. As Arm processor adoption accelerates, the need for Docker images that support both x86 and Arm will only grow. Luckily, recent releases of Docker are capable of building images for multiple architectures. In this post I will cover one way to achieve this by combining a recent release of Gitlab (12+), k3s and the buildx plugin for Docker.

I am taking inspiration for this post from two places. First, this excellent writeup was a great help in getting things started – https://dev.to/jdrouet/multi-arch-images-with-docker-152f. This post was also instrumental in getting things going – https://medium.com/@artur.klauser/building-multi-architecture-docker-images-with-buildx-27d80f7e2408.

I assume you already have a working installation of Gitlab with the container registry configured. Optionally, you can use Docker Hub, but I won’t cover that in detail; it mostly involves changing the repository URL and logging into Docker Hub. You will also need a system capable of running k3s that has at least Linux kernel 4.15. For this you can use either Ubuntu 18.04+ or CentOS 8. There may be other options but I know these two will work. The kernel version is a hard requirement and is something that caused me some headache. If I had just RTFM I could have saved myself some time. For my setup I installed k3s onto a CentOS 8 VM and then connected it to Gitlab. For information on how to set up k3s and connect it to Gitlab please see this post.
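
If you want to double check the host before going any further, it only takes a moment. The k3s install shown is the standard upstream one-liner; connecting it to Gitlab is covered in the linked post:

# the kernel version is the hard requirement; it needs to report 4.15 or newer
uname -r

# standard upstream k3s install script
curl -sfL https://get.k3s.io | sh -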

Once you are running k3s on a system with a supported kernel you can start building multi-arch images using buildx. I have created an example project available at https://github.com/dustinrue/buildx-example that you can import into Gitlab to get you started. This example project targets a runner tagged as kubernetes to perform the build. Here is a breakdown of what the .gitlab-ci.yml file is doing:

  • Installs buildx from GitHub (https://github.com/docker/buildx) as a Docker cli plugin
  • Registers qemu binaries to emulate whatever platform you request
  • Builds the images for the requested platforms
  • Pushes resulting images up to the Gitlab Docker Registry

Unlike the linked posts, I also had to add a docker buildx inspect --bootstrap step to make things work properly. Without it the new context was never active and the builds would fail.

The example .gitlab-ci.yml builds multiple architectures. You request which architectures to build using the --platform flag. This command, docker buildx build --push --platform linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v6 -t ${CI_REGISTRY_URL}:${CI_COMMIT_SHORT_SHA} . will cause images to be built for the listed architectures. If you need to see which architectures you can target, add docker buildx ls right before the build command to print the list supported by the builder.
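
For reference, here is a trimmed-down sketch of what a job doing those steps can look like. The buildx version, download URL, and qemu image are my assumptions and may have moved on since this was written; the example repository contains the file that actually runs:

build image:
  image: docker
  tags:
    - kubernetes
  services:
    - docker:dind
  variables:
    DOCKER_HOST: tcp://localhost:2375
    DOCKER_TLS_CERTDIR: ""
    DOCKER_CLI_EXPERIMENTAL: enabled
    BUILDX_VERSION: v0.4.1   # placeholder version
  script:
    # install buildx from GitHub as a Docker CLI plugin
    - mkdir -p ~/.docker/cli-plugins
    - wget -O ~/.docker/cli-plugins/docker-buildx "https://github.com/docker/buildx/releases/download/${BUILDX_VERSION}/buildx-${BUILDX_VERSION}.linux-amd64"
    - chmod +x ~/.docker/cli-plugins/docker-buildx
    # register qemu binaries to emulate the other platforms
    - docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
    # create and bootstrap a builder; without the inspect the new context never became active for me
    - docker buildx create --use --name multiarch
    - docker buildx inspect --bootstrap
    # log in, then build and push for each requested platform
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker buildx build --push --platform linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v6 -t "${CI_REGISTRY_URL}:${CI_COMMIT_SHORT_SHA}" .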

Once the build has completed you can validate everything using docker manifest inspect. Most likely you will need to enable experimental features for your client. Your command will look similar to this: DOCKER_CLI_EXPERIMENTAL=enabled docker manifest inspect <REGISTRY_URL>/drue/buildx-example:9ae6e4fb. Be sure to replace the path to the image with your own. Your output will look similar to this if everything worked properly:

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 527,
         "digest": "sha256:611e6c65d9b4da5ce9f2b1cd0922f7cf8b5ef78b8f7d6d7c02f793c97251ce6b",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 527,
         "digest": "sha256:6a85417fda08d90b7e3e58630e5281a6737703651270fa59e99fdc8c50a0d2e5",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 527,
         "digest": "sha256:30c58a067e691c51e91b801348905a724c59fecead96e645693b561456c0a1a8",
         "platform": {
            "architecture": "arm",
            "os": "linux",
            "variant": "v7"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 527,
         "digest": "sha256:3243e1f1e55934547d74803804fe3d595f121dd7f09b7c87053384d516c1816a",
         "platform": {
            "architecture": "arm",
            "os": "linux",
            "variant": "v6"
         }
      }
   ]
}

You should see multiple architectures listed.

I hope this is enough to get you up and running building multi-arch Docker images. If you have any questions please open an issue on Github and I’ll try to get it answered.

Whatever your reason for placing an NGINX proxy in front of your Gitlab installation, you need to ensure you’re using the right configuration to support all of Gitlab’s features. I recently discovered that although my installation was mostly working, I couldn’t view pipeline/build logs properly, and my proxy configuration was to blame. After some searching around I finally found that my config wasn’t quite right. To get the most out of Gitlab and ensure a smooth experience, use the configuration shown below as a template for your own. In my setup I use LetsEncrypt for SSL, so if you’re not you can remove the SSL specific parts. The important configuration is contained in the location block.

upstream gitlab {
  server <ip of your gitlab server>:<port>;
}

server {
    listen          443;
    server_name     <your gitlab server hostname>;

    ssl on;
    ssl_certificate <path to cert>;
    ssl_certificate_key <path to key>;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    server_tokens off;


    gzip on;
    gzip_vary on;
    gzip_disable "msie6";
    gzip_types application/json;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;

    location / {
       client_max_body_size   0;
       proxy_set_header    Host                $http_host;
       proxy_set_header    X-Real-IP           $remote_addr;
       proxy_set_header    X-Forwarded-Ssl     on;
       proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
       proxy_set_header    X-Forwarded-Proto   $scheme;

      proxy_pass https://gitlab;
    }
}

This configuration will pass all requests through to your Gitlab server and allow CI/CD pipeline logs to come through properly.