
from Dockerfail to Dockerfile

February 01, 2023


"The devil is in the details" is the way I like to describe how simple yet powerful, Dockerfiles are. There are different ways of accomplishing the same thing, yet few of those routes implement good practices, and we often end up shooting ourselves in the foot.

Dockerfiles are also maintained by both Dev and Ops (shared responsibility == no responsibility!), which leads to constant debate and struggle over tooling, security, and release process.

For this session we compiled 10 horror-story Dockerfile don'ts, a.k.a. “Dockerfails”. To end on an optimistic note, we will showcase as many good practices as possible in some popular CI/CD tools (GitHub, GitLab, Jenkins) with a touch of GitOps.



More Decks by djalal



  1. Agenda
     • coding like on my laptop
     • building anything and testing by hand
     • ignore cache and download the internet
     • the production-like misunderstanding
     • blindly include secrets and vulnerabilities
     • tags are mutable
     • health checks can kill your production…
     • multiple processes in same container
     • security as a second thought
  2. Dockerfail #1: coding like my laptop
     ➔ Signs of Dockerfail
       ◆ One COPY to rule them all ( COPY . . )
       ◆ Duplicate “COPY” lines
       ◆ Consecutive “RUN”
     ➔ Better Dockerfile
       ◆ Use a linter that will check syntax: hadolint
       ◆ Hadolint packs dozens of syntax rules
       ◆ Bonus: Enable a pre-commit hook in git to protect coworkers from the same branch
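As a minimal sketch of what avoiding the three signs above can look like, the Dockerfile below uses explicit COPY paths and a single chained RUN. The base image, package names, and the `app/` directory are assumptions for illustration, not taken from the talk. A file like this could then be checked with `hadolint Dockerfile`, which may still report further warnings (unpinned package versions, for example).

```dockerfile
# Sketch only: base image, packages, and paths are illustrative assumptions.
FROM debian:bookworm-slim

# One RUN with chained commands instead of several consecutive RUN lines;
# the apt cache is removed in the same layer so it never reaches the image.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*

WORKDIR /opt/app

# Copy an explicit directory instead of "COPY . ."
COPY app/ ./
```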
  3. Dockerfail #2: building anything and testing by hand
     ➔ Signs of Dockerfail
       ◆ Installs unwanted packages in final image
       ◆ Does not have a clue when critical versions change
       ◆ Oftentimes, the build is successful with incorrect content.
     ➔ Better Dockerfile
       ◆ Write a test suite, using container-structure-test by Google
         • Assert expected content,
         • But also assert unwanted content
       ◆ Bonus: run as a pre-commit git hook
  4. Dockerfail #3: ignore cache and download the internet [at your own risk]
     ➔ Signs of Dockerfail
       ◆ Each build downloads hundreds of packages
       ◆ Dependencies are never cached
       ◆ It takes minutes to rebuild between code patches
     ➔ Better Dockerfile
       ◆ Add content to the docker image from most static to most dynamic
       ◆ Split COPY and RUN instructions by source distance:
         • from the internet
         • from internal servers
         • from code in the internal repo
       Example: OS -> system packages -> maven dependencies -> internal tooling -> code
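The ordering in the example above (OS -> system packages -> maven dependencies -> internal tooling -> code) could be sketched roughly as follows. The image tag, the `git` package, and the `tools/` directory are assumptions chosen for illustration.

```dockerfile
# Sketch only: image tag, package, and paths are illustrative assumptions.
# 1. OS and build toolchain (most static)
FROM maven:3.9-eclipse-temurin-17

# 2. System packages, fetched from the internet and rarely changed
RUN apt-get update \
 && apt-get install -y --no-install-recommends git \
 && rm -rf /var/lib/apt/lists/*

WORKDIR /build

# 3. Maven dependencies: only pom.xml is copied first, so this layer is
#    rebuilt only when the dependency list changes
COPY pom.xml .
RUN mvn -B dependency:go-offline

# 4. Internal tooling (hypothetical directory)
COPY tools/ ./tools/

# 5. Application code (most dynamic), copied last
COPY src/ ./src/
RUN mvn -B package
```

With this layout, a change under `src/` should invalidate only the last two layers, so the dependency download is served from the build cache.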
  5. Dockerfail #4: the production-like misunderstanding
     ➔ Signs of Dockerfail
       ◆ Dockerfile.dev vs Dockerfile.prod in the same repo
       ◆ Dirty hacks in a Dockerfile with shell conditions like
         • RUN if [$IS_DEV] apt-get install -yq xdebug
     ➔ Better Dockerfile
       ◆ Multi stage builds
       ◆ Using per env profiles
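One way to replace the shell condition above with a multi stage build is sketched below, assuming a PHP application on the official `php` image. The stage names `base`, `dev`, and `prod` and the `src/` path are hypothetical.

```dockerfile
# Sketch only: image tag, stage names, and paths are illustrative assumptions.
FROM php:8.2-fpm AS base
WORKDIR /var/www/html
COPY src/ ./

# Development stage: the shared base plus xdebug
FROM base AS dev
RUN pecl install xdebug \
 && docker-php-ext-enable xdebug

# Production stage: the same base, without debugging tools
FROM base AS prod
```

Each environment then selects its stage at build time, for example `docker build --target dev .` or `docker build --target prod .`, so a single Dockerfile serves both without conditionals.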
  6. Dockerfail #5: blindly include secrets and vulnerabilities
     ➔ Signs of Dockerfail
       ◆ “Oops moments” when you inadvertently find “sensitive content” in docker images
       ◆ Like all silent bugs, it’s hard to detect by peer reviews, does not hurt until it’s too late
       ◆ Deleting from previous layer doesn’t actually delete it from docker image!
     ➔ Better Dockerfile
       ◆ Using the multi stage build to scan for secrets and fail builds as soon as possible.
       ◆ Use built in secret management functionality
       ◆ Stop CI pipeline from pushing such images in docker registry
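For the built in secret management mentioned above, a minimal sketch using BuildKit secret mounts might look like this. The Node.js base image and the secret id `npmrc` are assumptions; the point is that the credentials file is available only during one RUN step and is not written into any image layer.

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only: base image, secret id, and file names are illustrative assumptions.
FROM node:18-slim
WORKDIR /app

COPY package.json package-lock.json ./

# The secret is mounted for this RUN instruction only and is not stored in a layer.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci
```

The secret would be supplied at build time with something like `docker build --secret id=npmrc,src=$HOME/.npmrc .`, which requires BuildKit to be enabled.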
  7. Dockerfail #6: tags are mutable, changes get lost
     ➔ Signs of Dockerfail
       ◆ Using latest, prod or “stable” tag to deploy in production.
       ◆ Not having a 1-1 relation from code to docker images
         • (“what git SHA1 is running in production?”)
     ➔ Better Dockerfile
       ◆ Collect docker image content, via the new docker sbom CLI command (experimental)
       ◆ Use a git repository as an auditing space for Software Bill of Materials
       ◆ Multi tag and label each docker image (git SHA1, build number, timestamp, etc.)
         • Bonus: some Docker registries block tag reuse, enable it if you can!
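Labeling an image with its git SHA1 and build timestamp could be sketched as below. The build argument names `GIT_SHA` and `BUILD_DATE` are hypothetical; the label keys are the standard OCI image annotation names.

```dockerfile
# Sketch only: base image and build argument names are illustrative assumptions.
FROM alpine:3.19

# Values are injected by the CI pipeline at build time
ARG GIT_SHA=unknown
ARG BUILD_DATE=unknown

# Standard OCI annotation keys tie the image back to the code it was built from
LABEL org.opencontainers.image.revision="${GIT_SHA}" \
      org.opencontainers.image.created="${BUILD_DATE}"
```

A pipeline might build it with `docker build --build-arg GIT_SHA=$(git rev-parse HEAD) -t myapp:$(git rev-parse --short HEAD) .`, where `myapp` is a placeholder image name, and read the value back later with `docker inspect`.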
  8. Dockerfail #7: health checks can kill your production
     ➔ Signs of Dockerfail
       ◆ Uncalled-for restarts
       ◆ Slowness in changes
       ◆ Orchestrator confusion, moving pods around for no reason
     ➔ Better Dockerfile
       ◆ Take time to design a real-world healthcheck
         • Start with a readiness probe (HEALTHCHECK instruction in Dockerfile)
         • Add a livenessProbe
         • Slow start? Add a startupProbe, with the new K8s v1.19
         • Use observability data to adjust the many per-probe settings
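A HEALTHCHECK instruction with explicit timing settings might look like the sketch below. The nginx base image, the checked URL, and every timing value are assumptions that should be tuned from real observability data, as the slide suggests.

```dockerfile
# Sketch only: base image, URL, and timing values are illustrative assumptions.
FROM nginx:1.25

# curl is installed explicitly so the check does not depend on what the base image ships
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

# A failing check marks the container unhealthy after 3 consecutive failures;
# failures during the first 10 seconds do not count against the container.
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD curl -fsS http://localhost/ || exit 1
```

Kubernetes probes (readinessProbe, livenessProbe, startupProbe) are declared in the pod spec rather than in the Dockerfile, so the same endpoint would be configured there as well.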
  9. Dockerfail #8: multiple processes in same container/pod
     ➔ Signs of Dockerfail
       ◆ Installing supervisord / systemd process managers
       ◆ Installing an SSH server in a Dockerfile
       ◆ Scaling the world! (tightly coupled components)
     ➔ Better Dockerfile
       ◆ Split the app into scoped services: frontend, api, backend, cache, storage, etc.
       ◆ Use one Dockerfile per service
       ◆ Manage them all with the proper abstraction: docker-compose.yml, kustomize specs, Helm chart, etc.
  10. Dockerfail #9: security as a second thought
     ➔ Signs of Dockerfail
       ◆ Vulnerabilities (obviously)
       ◆ DAST and SAST tools on fire!
     ➔ Better Dockerfile
       ◆ Scan production images in a container
       ◆ Drop all kernel capabilities, and enable them one by one as needed
       ◆ Run containers as non-root users, and in read-only mode, writing only in explicitly declared volumes
       ◆ Run container runtime (e.g. containerd) in user namespace, and not as root daemon
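The non-root and explicit-volume advice above can be partly expressed in the Dockerfile itself. The sketch below assumes an Alpine base; the user name `app`, the `app/` directory, and the `/data` volume are hypothetical.

```dockerfile
# Sketch only: base image, user name, and paths are illustrative assumptions.
FROM alpine:3.19

# Create an unprivileged system user and group, plus a data directory it owns
RUN addgroup -S app \
 && adduser -S -G app app \
 && mkdir /data \
 && chown app:app /data

# Files are owned by the unprivileged user from the start
COPY --chown=app:app app/ /home/app/

# The only location the container is expected to write to
VOLUME /data

# Everything from here on, including the running container, uses the non-root user
USER app
WORKDIR /home/app
```

Capabilities and the read-only root filesystem are runtime settings rather than Dockerfile instructions, for example `docker run --cap-drop=ALL --read-only` followed by the image name.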
  11. Dockerfail #10: You probably don’t need a Dockerfile!
     ➔ How to build an OCI compliant image
       ◆ Docker build
       ◆ Buildkit (Docker)
       ◆ BuildPack
       ◆ Jib (Google)
       ◆ kaniko (Google)
       ◆ orca-build (Aleksa Sarai)
       ◆ img (Jessie Frazelle)
       ◆ buildah (RedHat)
       ◆ umoci (SuSE)
       ◆ Bazel (Google)
       ◆ S2I (RedHat)
       ◆ Package (metaparticle)
       ◆ Systemd-nspawn
       ◆ LXC
  12. @enlamp @laytoun
     Thanks - Kiitos • Questions?
     Resources
     ➔ Source code
       ◆ https://github.com/djalal/dockerfail
     ➔ Links
       ◆ Awesome-docker
       ◆ Play-with-docker
     TL;DR – 10 “Dockerfine”
       1. Linting
       2. Testing
       3. Caching
       4. Multi Staging
       5. Secrets
       6. Deploying
       7. Monitoring
       8. Scaling
       9. Scanning/Signing
       10. Adapting