If you are building Docker images in CI, there is a good chance you will want to set up build caching, because images can take a long time to build. This is particularly frustrating because locally you get quick builds thanks to layer caching, but that caching is lost in an ephemeral CI job. We can get those fast builds back by setting up layer caching in CI, but it won’t work without some configuration.
If you are building docker images with a self-hosted runner, you have probably enabled `privileged` execution to run docker-in-docker (the example in this post is for GitLab). You should spend some time understanding what that means, as it is very different security-wise from normal docker execution of CI jobs. Alternatives like rootless buildah or BuildKit avoid this, but they mean an unfamiliar build process if you build with docker locally, so this post uses docker-in-docker. As a starting point: in privileged mode the CI job container (run by gitlab-runner) will effectively execute as root on the gitlab-runner host, so you need to be extremely careful about what runs in the CI job (e.g. what is in `.gitlab-ci.yml`). Within the `docker build` that runs inside the job, the build containers are still isolated as in a normal build, so there is less risk from malicious code inside the `Dockerfile`, although there are still risks.
The first thing that is helpful to get your head around is that your docker build cache is separate from your images. You can delete all your images whilst keeping your cache, or clear your cache (`docker buildx prune -a`) whilst keeping your images. The build cache metadata lets docker link image layers to Dockerfile instructions, and decide whether to rebuild a layer or not.
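A quick way to convince yourself of this separation is to compare what the commands below report; a rough sketch, and the exact output will depend on your setup:

# Images are listed here...
docker images
# ...while the build cache shows up here (docker system df gives a summary of both)
docker buildx du
# Clearing the whole build cache leaves the images above untouched
docker buildx prune -a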
Docker’s legacy builder had a way to use an image for build caching. People used this in CI, so a lot of resources online still suggest it, which is quite confusing, because it doesn’t work with modern docker using BuildKit. See the footnote.
By default BuildKit uses a local cache storage backend, which is what you are likely using when building locally; local caching isn’t any good in CI, where we have an ephemeral file system. Thankfully, BuildKit also supports other cache backends that store the cache in an external location, which is great for CI! There is a note in the docs that says something about other cache storage backends needing a ‘different driver’… what?
Aside: jargon busting
I am not a massive fan of how Docker’s documentation uses conflicting bits of terminology carelessly. I had to get my head around some of the jargon to get this all working. This is just some of the terminology needed for this post.
- docker engine: A client-server application for doing most things with docker: managing images, containers, networking etc. It includes the docker daemon (`dockerd`), which is the server part of that model. The `docker` CLI is normally the client.
- BuildKit: the backend to the `docker build` part of the docker engine. It builds images. The reason it has a funky name is that it is a replacement for the legacy docker builder; it started out as a more powerful builder but is now the default.
- build driver: sometimes just called ‘driver’ in the docs when in a building context. These are different options for how BuildKit is executed. (You can check which driver each of your builders uses with the command shown after this list.)
  - `docker` driver: the default builder, which is BuildKit bundled into the docker daemon (`dockerd`).
  - `docker-container` driver: if this is used, a separate BuildKit container is spawned to do the building.
  - I think these are awful names; we are seriously overloading the word docker now. There are also `kubernetes` and `remote` drivers. The docs also have storage drivers and logging drivers; they’re not related.
- buildx: basically just a subcommand of the docker CLI to access more advanced options in BuildKit. It is a separate subcommand because for a while `buildx` let you use BuildKit while `docker build` used the legacy builder, but now `docker build` also uses BuildKit.
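To see which driver a builder is using, `docker buildx ls` lists your builders; a small aside, and the exact output format depends on your docker version:

# Lists builders; the driver column shows docker or docker-container,
# and the entry marked with * is the currently selected builder
docker buildx ls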
Back to it: external build caches
The simplest external build cache is the `inline` cache, which you can use like this:
docker buildx build --push -t <registry>/<image> \
--cache-to type=inline \
--cache-from type=registry,ref=<registry>/<image> .
With the `inline` cache, metadata is stored in the image itself, which is nice and simple but has two downsides:
- cache metadata is stored in the image, which will bloat it (probably not massively though)
- as it is stored in your single final image, it won’t cache the earlier stages of a multi-stage build
I think most applications will benefit from a multi-stage build; reducing the size and security risk of the image is easily worth any added complexity for me.
Externally caching a multi-stage build
Let’s start working with an actual example, for a Go web server. We build the Go binary, baking a commit hash into it to help with debugging, and then copy the binary into a distroless image. As we aren’t using any CGO, we can use the `static` distroless image. It is only about 2MB and is effectively just static assets, so you are about as safe from CVEs as you can get. This is a great tutorial about the distroless images.
FROM golang:1.24 AS build-stage
ARG GIT_COMMIT
WORKDIR /app
COPY go.mod ./
RUN go mod download
COPY . ./
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags "-X main.GitCommit=${GIT_COMMIT}" \
-o /server main.go
# Deploy the application binary into a lean image
FROM gcr.io/distroless/static-debian12 AS app
ARG GIT_COMMIT
LABEL git_commit=$GIT_COMMIT
WORKDIR /
COPY --from=build-stage /server /server
EXPOSE 8080
USER nonroot:nonroot
ENTRYPOINT ["/server"]
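For a local build of this Dockerfile, you would pass the commit hash in as a build argument; a sketch, with the image tag just a placeholder:

# Bake the current commit hash into the binary via the GIT_COMMIT build arg
docker build --build-arg GIT_COMMIT=$(git rev-parse --short HEAD) -t myapp:dev .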
Now we have our multi-stage Dockerfile, we need to swap cache storage backends. Our best option here is the `registry` backend, which puts the build cache into a separate image. Note that we can call our cache image whatever we like, and even push it to a separate registry:
docker buildx build --push -t <registry>/<image> \
--cache-to type=registry,ref=<registry>/<cache-image>[,parameters...] \
--cache-from type=registry,ref=<registry>/<cache-image> .
There are two pitfalls with the `registry` cache:
- The `registry` cache (as of docker 28.3.0) is not supported by the `docker` build driver. Instead we must use the `docker-container` build driver. See jargon busting above.
- It has two `mode` settings, `min` and `max`. The `min` mode only caches the final image, not the previous stages, so for a multi-stage build we need to opt in to `max` (see the example after this list). I lost quite a lot of time to this, partly due to trusting an LLM rather than the documentation.
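Putting that together, the cache flags from earlier become something like this (registry and image names are placeholders as before, and it will only work once we are using the `docker-container` driver, which we set up next):

docker buildx build --push -t <registry>/<image> \
  --cache-to type=registry,ref=<registry>/<cache-image>,mode=max \
  --cache-from type=registry,ref=<registry>/<cache-image> .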
To set up the `docker-container` driver:
docker context create my-builder
docker buildx create my-builder --driver docker-container --use
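You can confirm the new builder is selected and see its driver with the following; `--bootstrap` also starts the BuildKit container so the first build doesn’t have to:

# Prints the current builder's driver (should be docker-container) and status
docker buildx inspect --bootstrap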
Example for GitLab CI
Here is a complete `.gitlab-ci.yml`, assuming you want to build on commits to `main` and tag the image as `main`:
build:
  stage: build
  image: docker:28.3.0
  services:
    - docker:28.3.0-dind
  variables:
    IMAGE_NAME: $CI_REGISTRY/myapp
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login $CI_REGISTRY -u $CI_REGISTRY_USER --password-stdin
  script:
    - docker context create builder-context
    - docker buildx create --use --driver docker-container --name mybuilder builder-context
    - |
      docker buildx build --push \
        -t $IMAGE_NAME:$CI_COMMIT_REF_SLUG \
        --cache-to type=registry,ref=$IMAGE_NAME:cache,mode=max \
        --cache-from type=registry,ref=$IMAGE_NAME:cache \
        .
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
This was mostly taken from GitLab’s docs. `$CI_COMMIT_REF_SLUG` is the branch name on branch builds and the tag on tag builds, slugified.
In that CI definition we are using a single cache image tag. This means builds on git tags would benefit from cached builds on `dev`, for example. Depending on your git strategy, it might make sense to have separate cache images for main and dev: you could set the cache-to/from references to `cache-main` and `cache-dev` on branch builds, and for git tag builds you could `--cache-from` the `cache-main` image, assuming you tag releases from main. A sketch of that per-branch variant follows.
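As a sketch of the per-branch approach (only the buildx command from the CI job above changes; the surrounding YAML stays the same):

# Branch builds: read and write a per-branch cache image (cache-main, cache-dev, ...)
docker buildx build --push \
  -t $IMAGE_NAME:$CI_COMMIT_REF_SLUG \
  --cache-to type=registry,ref=$IMAGE_NAME:cache-$CI_COMMIT_REF_SLUG,mode=max \
  --cache-from type=registry,ref=$IMAGE_NAME:cache-$CI_COMMIT_REF_SLUG \
  .

# Tag builds: reuse the main branch cache, assuming releases are tagged from main
docker buildx build --push \
  -t $IMAGE_NAME:$CI_COMMIT_REF_SLUG \
  --cache-from type=registry,ref=$IMAGE_NAME:cache-main \
  .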
Footnotes
- With the legacy docker builder, there was a `--cache-from` flag, which allowed you to specify images that the builder could attempt to use for caching. It would try to match layers up with Dockerfile instructions, but that doesn’t seem to be an option with BuildKit.