While building multi-arch images, I noticed that the builds were really unreliable when run in my CI/CD pipelines. After a bit of research, I found that buildx and QEMU emulation aren’t quite stable when used with Docker in Docker. Although I could retry the job until it finished, I decided to look into other ways of doing multi-arch builds with buildx.
One of the ways that the buildx docs hint at for doing multi-arch builds is by using worker instances on native nodes. This is rather poorly described in the official documentation at https://docs.docker.com/buildx/working-with-buildx/ and slightly better on GitHub at https://github.com/docker/buildx/blob/master/docs/reference/buildx_create.md. By using native nodes to perform the builds, you don’t incur any of the emulation overhead that comes with QEMU-based methods. This greatly speeds up the build process, provided you have a machine available for each architecture you want to build for. In my setup I have a k3s-based Kubernetes cluster with a small number of amd64-based systems as well as an arm64 Pi4.
As I mentioned, the documentation is a bit lacking in examples, so here is how I went about setting up builder instances for amd64 and arm64. For this to work you must have the latest version of Docker installed with experimental features enabled, which is explained best at https://stackoverflow.com/questions/57937733/how-to-enable-experimental-docker-cli-features. I also assume, for the purposes of this post, that you are using a Kubernetes cluster like I am, with nodes for each architecture you want to build for.
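As a rough sketch of the experimental-features step: on Docker releases where buildx is still gated behind the experimental flag, the CLI reads the DOCKER_CLI_EXPERIMENTAL environment variable. Newer releases ship buildx without it, so treat this as an assumption about your Docker version rather than a hard requirement.

```shell
# Enable experimental CLI features for this shell session only.
# (Assumption: your Docker version still gates buildx behind this flag.)
export DOCKER_CLI_EXPERIMENTAL=enabled

# Confirm that the buildx plugin responds before going any further.
if command -v docker >/dev/null 2>&1; then
  docker buildx version || echo "buildx is not available in this Docker install"
else
  echo "docker CLI not found on PATH"
fi
```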
Here is how I did it.
In my setup I have a k3s cluster with amd64 and arm64 nodes. In buildx these are known as platforms, under the names linux/amd64 and linux/arm64 respectively. So that I could build for those platforms, I created new builders and tagged them with the info required for buildx to route the builds properly. First, I set my KUBECONFIG variable to point at only my k3s config file.
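For illustration, setting KUBECONFIG and checking the architecture of each node could look like this. The path is an assumption: it is where a k3s server writes its kubeconfig by default, so adjust it if you copied the file somewhere else.

```shell
# Assumption: the k3s kubeconfig lives at the default server location.
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml

# List the nodes with their kubernetes.io/arch label as an extra column.
# This is the label a node selector can match on to pick amd64 or arm64 nodes.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes -L kubernetes.io/arch || echo "could not reach the cluster"
else
  echo "kubectl not found on PATH"
fi
```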
Next, I created a new builder for amd64:
docker buildx create --name k3s --driver kubernetes --platform linux/amd64 --driver-opt nodeselector=kubernetes.io/arch=amd64 --node k3s-amd64
Next, I appended to that builder another node that targets arm64:
docker buildx create --append --name k3s --driver kubernetes --platform linux/arm64 --driver-opt nodeselector=kubernetes.io/arch=arm64 --node k3s-arm64
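If you have more than two architectures, the same two commands can be generated in a loop. The following is only a sketch: BUILDER, ARCHES, DRY_RUN and create_cmds are names I made up for illustration, and by default the script just prints the commands so you can review them before running anything.

```shell
# Sketch: generate one "docker buildx create" command per architecture.
BUILDER=k3s
ARCHES="amd64 arm64"
DRY_RUN=${DRY_RUN:-1}   # default: print the commands instead of running them

create_cmds() {
  append=""
  for arch in $ARCHES; do
    echo "docker buildx create ${append}--name $BUILDER --driver kubernetes --platform linux/$arch --driver-opt nodeselector=kubernetes.io/arch=$arch --node $BUILDER-$arch"
    append="--append "   # every node after the first is appended to the builder
  done
}

if [ "$DRY_RUN" = "1" ]; then
  create_cmds            # review the output first
else
  create_cmds | sh       # execute the generated commands in order
fi
```

Run it once as-is to check the output, then again with DRY_RUN=0 to actually create the builder.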
There are a number of things happening in these commands, but each one is best described in the GitHub-based documentation. In essence, we are creating a builder for each platform we’re going to use and tagging each one with the right information. Note the use of --node in these commands as well. Without this parameter, you’ll run into an issue with the builders having the same name. This isn’t described well in the documentation.
If you run docker buildx ls now, you’ll see that the new builder is available but not yet running. To get it running, tell buildx to use your new builder and then perform a build. Something like this:
docker buildx use k3s
docker buildx build -t test-build . --platform linux/amd64,linux/arm64
What you’ll see next is two new pods appearing in your k3s cluster, one for each platform. Soon after, you’ll see build output. The rest of the build process works the way you’re used to if you’ve used buildx or BuildKit before.
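To confirm where the builder pods ended up, or to start them without running a build at all, something like the following should work. It is a sketch that assumes the builder is named k3s as above and that kubectl points at the same cluster; BUILDER is just an illustrative variable name.

```shell
BUILDER=k3s

# Boot the builder nodes (if they are not running yet) and print their status.
if command -v docker >/dev/null 2>&1; then
  docker buildx inspect "$BUILDER" --bootstrap || echo "could not bootstrap builder $BUILDER"
else
  echo "docker CLI not found on PATH"
fi

# Show the pods in the current namespace along with the node each one runs on.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -o wide || echo "could not reach the cluster"
else
  echo "kubectl not found on PATH"
fi
```

The NODE column in the kubectl output is the quickest way to check that each builder pod landed on a node of the expected architecture.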
To tear this down, and remove the pods from your k3s cluster, issue:
docker buildx rm k3s
This will remove the builder and, if they exist, the pods in your cluster.