Lessons Learned with Dockerfiles and
Docker Builds
Aaron Kalin
Tech Evangelist
Slide 2
Slide 2 text
Quick Introduction
Who is this person with the red hair?
Slide 3
Slide 3 text
Hi! I’m Aaron
– Tech Evangelist with Datadog
– Software developer and
Sysadmin for over 20 years
– Active maintainer on
Ecommerce Workshop
– Social media: @martinisoft
Slide 4
Slide 4 text
Overview
Slide 5
Slide 5 text
Overview
– What is a Dockerfile anyway?
– 7 Lessons Learned via Dockerfiles
– Going further with Buildkit and buildx
– Takeaways and Resources
Slide 6
Slide 6 text
No content
Slide 7
Slide 7 text
Dockerfiles Crash Course Part 1
– It’s a lot like a Makefile
– Each instruction (FROM, RUN,
COPY, ENTRYPOINT) creates
a layer
– The beginning of a Dockerfile
is always FROM
– Usually an operating system
image
Slide 8
Slide 8 text
Dockerfiles Crash Course Part 2
– Each layer generates a cache
– If the layer instruction doesn’t
change, the cache won’t be
invalidated (usually)
– Each command is generally
straightforward
– The simplest container can just
be a FROM, but you probably
want an ENTRYPOINT and
CMD
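A minimal sketch of the crash course so far, assuming a Debian "slim" base image (the image tag and the echoed text are illustrative, not from the talk):

```dockerfile
# Every Dockerfile starts with FROM; here an OS base image (assumed tag).
FROM debian:bookworm-slim

# ENTRYPOINT is the executable the container runs.
ENTRYPOINT ["echo"]

# CMD supplies default arguments to ENTRYPOINT; arguments given to
# "docker run <image> ..." replace CMD but keep ENTRYPOINT.
CMD ["hello from a container"]
```

If no instruction changes, a second `docker build` of this file should reuse the cached layers instead of rebuilding them.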
Slide 9
Slide 9 text
Dockerfiles Crash Course Part 3
– You start as root
– Unless you use USER to change this, you are always assumed to be root.
– Your working directory is always /
– To change the default assumed directory you use WORKDIR.
– EXPOSE your ports
– Only if you need to. By default it’s all closed down.
– Use ENV or ARG to pass in arguments or store re-used arguments
– Remember that these are saved in each layer so NO SECRETS!
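The instructions above could be combined roughly like this. It is a sketch only: the user name, directory, port, and variable names are hypothetical.

```dockerfile
FROM debian:bookworm-slim

# ARG is a build-time value (override with --build-arg); ENV persists into
# the running container. Both can be seen in the image metadata or history,
# so never put secrets in either.
ARG APP_VERSION=dev
ENV APP_HOME=/app

# Still root at this point, so we can create an unprivileged user.
RUN useradd --create-home --shell /usr/sbin/nologin appuser

# WORKDIR changes the default directory (and creates it if missing).
WORKDIR ${APP_HOME}

# Everything after USER, including the running container, is no longer root.
USER appuser

# EXPOSE documents the port the app listens on; it is still published
# explicitly at run time (for example with "docker run -p 8080:8080").
EXPOSE 8080
```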
Slide 10
Slide 10 text
No content
Slide 11
Slide 11 text
Lesson 1: Be Careful of the Base
Slide 12
Slide 12 text
Lesson 1: Be Careful of the Base
(Unless it’s Ace of Base)
– Alpine images can be problematic
– Compatibility issues from musl vs glibc
– Not always a version guarantee for packages
– Prefer “slim” images where possible
– Smaller by default (less is more!)
– Fewer surprises or upgrades needed
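As an illustration of the base image choice, assuming a Python app (the language and tags are examples, not something the talk prescribes):

```dockerfile
# Alpine variant, shown only for comparison: it is musl-based, so some
# packages built against glibc may need extra work to install.
# FROM python:3.12-alpine

# Debian-based "slim" variant: glibc, with a reduced package set.
FROM python:3.12-slim
```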
Slide 13
Slide 13 text
Lesson 2: Chain your RUN commands
Slide 14
Slide 14 text
Lesson 2: Chain your RUN commands
– Layer efficiency for the win
– Remember each command creates a layer
– Set up dependency packages as early as possible
– Sort your dependency package names
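A sketch of a chained RUN, assuming a Debian-based image and an illustrative package list:

```dockerfile
FROM debian:bookworm-slim

# One RUN instruction, so one layer: the index update and the install are
# chained with &&. Package names are sorted alphabetically, one per line,
# which keeps diffs small and duplicates easy to spot.
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
      ca-certificates \
      curl \
      git
```

Placing this near the top of the Dockerfile means the layer stays cached while later, more frequently changing steps are rebuilt.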
Slide 15
Slide 15 text
Lesson 3: Cleanup is good
Slide 16
Slide 16 text
Lesson 3: Cleanup is good
– Again, Layer efficiency
– Can drastically reduce resulting image size
– Any cleanup from that layer MUST BE DONE IN THAT LAYER
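One way to keep cleanup in the same layer might look like this. The download URL and archive name are placeholders, and it assumes the tool itself does not need curl at run time.

```dockerfile
FROM debian:bookworm-slim

# Install, use, and clean up within a single RUN. Files deleted in a later
# RUN would still exist in this layer and still count toward image size.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && curl -fsSL https://example.com/tool.tar.gz -o /tmp/tool.tar.gz \
 && tar -xzf /tmp/tool.tar.gz -C /usr/local/bin \
 && rm /tmp/tool.tar.gz \
 && apt-get purge -y curl \
 && apt-get autoremove -y \
 && rm -rf /var/lib/apt/lists/*
```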
Slide 17
Slide 17 text
Lesson 4: Run your app dependency
installs late
Slide 18
Slide 18 text
Lesson 4: Run your app package installs
separately
– Remember the layer cache
– When a layer cache is rebuilt, all following (below) layers have to be rebuilt
– Your app packages don’t change too often, but your source code does
– Bring in your app code after this step if it’s not a compiled language
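A sketch of that ordering for an interpreted language, assuming a Python app with a requirements.txt and an app.py entry point (both file names are assumptions):

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# Copy only the dependency manifest first. This layer and the install
# below are rebuilt only when requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Source code changes often, so it comes last; editing it invalidates only
# the layers from this point down.
COPY . .

CMD ["python", "app.py"]
```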
Slide 19
Slide 19 text
Lesson 5: Don’t forget your
.dockerignore
Slide 20
Slide 20 text
Lesson 5: Don’t forget your .dockerignore
– The easy way to reduce your image size
– Works like a .gitignore file to avoid copying files and directories into your
container when you do an ADD/COPY of EVERYTHING
– Always inspect your final images to see if the files there really need to be
there. They probably don’t.
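To illustrate, here is a "copy everything" Dockerfile with a possible .dockerignore shown as comments. The ignore patterns and the Node.js base image are assumptions chosen for the example.

```dockerfile
# Assumed contents of a .dockerignore file next to this Dockerfile
# (one pattern per line, shown here as comments):
#   .git
#   node_modules
#   *.log
#   .env

FROM node:20-slim
WORKDIR /app

# With the patterns above, the ignored paths never reach the build context,
# so this COPY cannot pull them into the image.
COPY . .
```

A tool such as Dive (linked in the resources) can then be used to check what actually ended up in each layer.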
Slide 21
Slide 21 text
Lesson 6: Use multi-stage builds
Slide 22
Slide 22 text
Lesson 6: Use multi-stage builds
– Your new superpower with Docker
– Technically you are doing multi-stage
builds already if you aren’t building
from scratch
– Organize your multi-stage builds to
put low activity parts earlier
– Better control of dependency changes
– Adds complexity since you now have
multiple build targets
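A minimal two-stage sketch, assuming a Go application (the language, image tags, and paths are illustrative):

```dockerfile
# Stage 1: build. Dependencies change rarely, so they are fetched before
# the source is copied in, which keeps that layer cached.
FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: runtime. Only the compiled binary is copied over; the compiler
# and source tree stay behind in the build stage.
FROM debian:bookworm-slim
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

Each named stage is also a build target, so `docker build --target build .` would stop after the first stage. That is the extra complexity the slide mentions.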
Slide 23
Slide 23 text
Lesson 7: Label EVERYTHING
Slide 24
Slide 24 text
Lesson 7: Label EVERYTHING
– Bring out your inner office manager
– Out with the MAINTAINER and in with the more flexible LABEL
– Works well for providing metadata to Kubernetes (e.g. Datadog
Autodiscovery)
– There is a standard set of labels that I will link to at the end
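For example, a LABEL instruction using keys from the OpenContainers annotations spec might look like this. The keys are from the spec; all values are placeholders.

```dockerfile
FROM debian:bookworm-slim

# One LABEL instruction with several key/value pairs.
LABEL org.opencontainers.image.title="example-app" \
      org.opencontainers.image.description="Placeholder description" \
      org.opencontainers.image.source="https://example.com/example/example-app" \
      org.opencontainers.image.version="1.0.0" \
      org.opencontainers.image.licenses="MIT"
```

The labels can be read back with `docker inspect` on the built image.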
Slide 25
Slide 25 text
Going further
with Buildkit and buildx
Slide 26
Slide 26 text
Buildkit is AWESOME
– Buildkit was first introduced in Docker 18.09
– Still kept behind a feature flag 😢 (Slowly merging into build)
– Enable it via DOCKER_BUILDKIT=1 environment variable per command
– Or add { "features": { "buildkit": true } } to your Docker
daemon config (daemon.json) and restart
– Once enabled, check out the docker buildx command
Slide 27
Slide 27 text
But why?
– Lots of major changes
– You can pass secrets to a container at build time that won’t stay with the
container
– You can also send SSH agent keys (private repo clone in the container)
– Concurrent builds thanks to a new intermediary format
– Distributed builds via a registry build cache*
– OpenTracing of builds (Trace layer build performance with Datadog?)
– Multi-arch builds (AMD64 and ARM64? Sure!)
– Even more… I can’t list it all!
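A sketch of the build-time secret feature, with the build commands shown as comments. The secret id "api_token" and the file name are hypothetical.

```dockerfile
# syntax=docker/dockerfile:1
# Build with Buildkit enabled, for example:
#   DOCKER_BUILDKIT=1 docker build --secret id=api_token,src=./api_token.txt .
# Multi-arch build via buildx (needs a builder that supports both platforms):
#   docker buildx build --secret id=api_token,src=./api_token.txt \
#     --platform linux/amd64,linux/arm64 .

FROM debian:bookworm-slim

# The secret is mounted at /run/secrets/api_token for this RUN step only
# and is not written into the resulting layer.
RUN --mount=type=secret,id=api_token \
    test -s /run/secrets/api_token && echo "secret was available at build time"
```

SSH agent forwarding works along the same lines: `RUN --mount=type=ssh ...` in the Dockerfile, paired with `docker build --ssh default .` on the command line.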
Slide 28
Slide 28 text
Takeaways
And resources
Slide 29
Slide 29 text
Don’t forget...
– Pay attention to the base
– Less is more with commands (compact them)
– Invest in multi-stage builds
– Clean up after packages and use .dockerignore
– Label ALL THE THINGS!
– Try out Buildkit via buildx
Slide 30
Slide 30 text
Resources
– Docker Docs: Search for “Buildkit”
– Ecommerce workshop project
https://github.com/DataDog/ecommerce-workshop
– OpenContainers Annotations (Label ALL THE THINGS)
https://github.com/opencontainers/image-spec/blob/master/annotations.md
– Dive (for exploring the filesystem in each layer of a container)
https://github.com/wagoodman/dive