An Empirical Analysis of the Docker Container Ecosystem on GitHub
Paper on the Docker container ecosystem on GitHub presented by Jürgen Cito and Gerald Schermann at the International Conference on Mining Software Repositories (MSR'17), co-located with ICSE'17, in Buenos Aires, Argentina
Credits: Astrid Westvang, https://flic.kr/p/pWJLCW
Jürgen Cito, Gerald Schermann, E. Wittern, P. Leitner, S. Zumberi, H. C. Gall
@citostyle @sh3llcat
package an application with all of its dependencies into a standardized unit for software development*
Containers consist of everything that enables software to run:
> Code
> Runtime
> System Tools
> System Libraries
* https://www.docker.com/what-docker
# Make sure you have the redis source code checked out in
# the same directory as this Dockerfile
FROM ubuntu:12.04
MAINTAINER dockerfiles http://dockerfiles.github.io
RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y gcc make g++ build-essential libc6-dev tcl wget
RUN wget http://download.redis.io/redis-stable.tar.gz -O - | tar -xvz
# RUN tar -zvzf /redis/redis-stable.tar.gz
RUN (cd /redis-stable && make)
RUN (cd /redis-stable && make test)
RUN mkdir -p /redis-data
VOLUME ["/redis-data"]
EXPOSE 6379
ENTRYPOINT ["/redis-stable/src/redis-server"]
CMD ["--dir", "/redis-data"]
Dockerfile --build--> Docker Image --run--> Docker Container

RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y gcc make g++ build-essential libc6-dev tcl wget
RUN sudo -E pip install scipy:0.18.1
# RUN tar -zvzf /redis/redis-stable.tar.gz
RUN (cd /redis-stable && make)
RUN (cd /redis-stable && make test)
ADD redis.conf /var/www/redis.conf
RUN mkdir -p /redis-data
VOLUME ["/redis-data"]
EXPOSE 6379
ENTRYPOINT ["/redis-stable/src/redis-server"]
CMD ["--dir", "/redis-data"]
Labeled parts: Base Image, Install Dependencies, Volume, Open Port, Start Server
A Dockerfile defines the infrastructure and dependencies of a container through instructions:
> FROM: the base image can be an OS (Ubuntu) or a different, existing image
> RUN: runs commands as if you were typing them in the command line
> ADD: copies local files from the build context into the container
Dockerfile Linter to check adherence to best practices
> Image version pinning missing (DL3006, DL3007)
> Version pinning on dependencies (DL3008, DL3013)
> ADD instead of COPY (DL3020)
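The ADD-instead-of-COPY finding can be illustrated with a small check. This is a minimal sketch, not the linter's actual implementation: the function name `add_should_be_copy` and the archive suffix list are assumptions made for illustration, and the JSON-array form of ADD is not handled. It assumes ADD is only warranted for remote URLs or local tar archives that should be auto-extracted.

```python
# Hypothetical helper: decide whether an ADD line could be a plain COPY.
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".tar.bz2", ".tar.xz")


def add_should_be_copy(line: str) -> bool:
    """Return True when an ADD instruction only copies plain local files."""
    tokens = line.strip().split()
    if len(tokens) < 3 or tokens[0].upper() != "ADD":
        return False
    # Everything between the keyword and the destination is a source;
    # flags such as --chown=... are skipped.
    sources = [t for t in tokens[1:-1] if not t.startswith("--")]
    if not sources:
        return False
    for src in sources:
        if src.startswith(("http://", "https://")):
            return False  # remote download: ADD has a purpose here
        if src.endswith(ARCHIVE_SUFFIXES):
            return False  # local archive: ADD would auto-extract it
    return True


print(add_should_be_copy("ADD redis.conf /var/www/redis.conf"))
print(add_should_be_copy("ADD http://download.redis.io/redis-stable.tar.gz /tmp/"))
```

Applied to the slide's Dockerfile, only the `ADD redis.conf /var/www/redis.conf` line would be reported by this sketch.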
Fine Grained Relational Model [diagram: entities Rule Violation, Instruction, Diff, Structured Change, Change Type, Parameter; relations violates, contains, before, after, has; cardinalities 1:n and 1:1; annotation: Dockerfile Linter]
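How a diff between two Dockerfile revisions might be turned into structured changes can be sketched as follows. This is an illustrative assumption, not the authors' tooling: the class `StructuredChange`, the function `diff_revisions`, and the change-type labels "ADD", "DELETE", and "UPDATE" are hypothetical names, and the comparison is a plain line-level diff from Python's standard `difflib`.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass
class StructuredChange:
    change_type: str  # hypothetical labels: "ADD", "DELETE", "UPDATE"
    instruction: str  # Dockerfile keyword of the affected line, e.g. "RUN"
    before: str       # line in the old revision ("" if newly added)
    after: str        # line in the new revision ("" if deleted)


def keyword_of(line: str) -> str:
    """Return the upper-cased instruction keyword of a Dockerfile line."""
    return line.split(None, 1)[0].upper() if line.strip() else ""


def diff_revisions(old: List[str], new: List[str]) -> List[StructuredChange]:
    """Compare two revisions given as lists of instruction lines."""
    changes = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        removed, added = old[i1:i2], new[j1:j2]
        paired = min(len(removed), len(added))
        # Lines replaced position by position count as updates.
        for before, after in zip(removed, added):
            changes.append(StructuredChange("UPDATE", keyword_of(after), before, after))
        for before in removed[paired:]:
            changes.append(StructuredChange("DELETE", keyword_of(before), before, ""))
        for after in added[paired:]:
            changes.append(StructuredChange("ADD", keyword_of(after), "", after))
    return changes


old_rev = ["FROM ubuntu", "RUN apt-get install python", "ADD redis.conf /var/www/redis.conf"]
new_rev = ["FROM ubuntu:12.04", "RUN apt-get install python",
           "ADD redis.conf /var/www/redis.conf", "EXPOSE 6379"]
changes = diff_revisions(old_rev, new_rev)
for change in changes:
    print(change)
```

Storing the old and new line on each change mirrors the before/after relation of the Diff entity in the model; parameters of a change type are left out of this sketch.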
All (70197) Dockerfiles on GitHub
218259 Revisions
1483763 changes
260829 violations
Figure: % of Projects with Base Image Referenced in FROM Statements (series: All, Top-100, Top-1000; visible base images: alpine, java, nginx, ruby, scratch, php, fedora, busybox)
Base image types: OS ~60%, Runtime ~30%, Application ~5%
Base Images & Sizes: 125 MB, 195 MB, 4 MB
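The share of projects per base image can be approximated by reading the first FROM statement of each Dockerfile and tallying the image names. The sketch below is a simplified assumption of such a count, not the study's scripts: the function names `base_image` and `base_image_shares` are hypothetical, each Dockerfile is treated as one project, and multi-stage builds are reduced to their first stage.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def base_image(dockerfile_text: str) -> Optional[str]:
    """Return the image name of the first FROM statement, without tag or digest."""
    for raw in dockerfile_text.splitlines():
        tokens = raw.strip().split()
        if not tokens or tokens[0].upper() != "FROM":
            continue
        refs = [t for t in tokens[1:] if not t.startswith("--")]  # skip --platform=...
        if not refs:
            return None
        ref = refs[0].split("@", 1)[0]            # drop a digest such as @sha256:...
        head, sep, last = ref.rpartition("/")     # keep registry host:port intact
        return head + sep + last.split(":", 1)[0]  # drop the tag
    return None


def base_image_shares(dockerfiles: Iterable[str]) -> Dict[str, float]:
    """Map each base image to the percentage of Dockerfiles that reference it."""
    counts = Counter(name for name in map(base_image, dockerfiles) if name)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.most_common()}


sample = [
    "FROM ubuntu:12.04\nRUN apt-get update\n",
    "FROM ubuntu\n",
    "FROM alpine:3.4\n",
    "# web server\nFROM nginx\n",
]
shares = base_image_shares(sample)
for name, pct in shares.items():
    print(f"{name:10} {pct:5.1f}%")
```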
Dockerfile Linter to check adherence to best practices
https://github.com/lukasmartinelli/hadolint
Version Pinning
> Image: FROM ubuntu:12.04 instead of FROM ubuntu
> pip: RUN pip install django==1.9 instead of RUN pip install django
> apt-get: RUN apt-get install python=2.7 instead of RUN apt-get install python
> :latest: FROM ubuntu:12.04 instead of FROM ubuntu:latest
Copy vs. Add
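The version pinning examples above can be turned into a small checker. This is a rough sketch under stated assumptions, not how hadolint works internally: the function names `unpinned_packages` and `check_pinning` are hypothetical, shell commands are split on whitespace only, and line continuations are not handled. The rule IDs are the ones named on the linter slide.

```python
from typing import List, Tuple

SEPARATORS = {"&&", "||", ";", "|"}


def unpinned_packages(args: List[str], tool: str, pin: str) -> List[str]:
    """Return packages after `<tool> install` that lack the pin marker."""
    found = []
    i = 0
    while i + 1 < len(args):
        if args[i] == tool and args[i + 1] == "install":
            j = i + 2
            while j < len(args) and args[j] not in SEPARATORS:
                if not args[j].startswith("-") and pin not in args[j]:
                    found.append(args[j])
                j += 1
            i = j
        else:
            i += 1
    return found


def check_pinning(dockerfile_text: str) -> List[Tuple[int, str, str]]:
    """Return (line number, rule ID, offending image or package) findings."""
    findings = []
    for lineno, raw in enumerate(dockerfile_text.splitlines(), start=1):
        tokens = raw.strip().split()
        if not tokens or tokens[0].startswith("#"):
            continue
        keyword, args = tokens[0].upper(), tokens[1:]
        if keyword == "FROM":
            refs = [a for a in args if not a.startswith("--")]
            image = refs[0] if refs else ""
            if not image or image == "scratch" or "@" in image:
                continue  # nothing to pin, or pinned by digest
            tag = image.rpartition("/")[2].partition(":")[2]
            if not tag:
                findings.append((lineno, "DL3006", image))  # no tag at all
            elif tag == "latest":
                findings.append((lineno, "DL3007", image))  # floating tag
        elif keyword == "RUN":
            for pkg in unpinned_packages(args, "apt-get", "="):
                findings.append((lineno, "DL3008", pkg))
            for pkg in unpinned_packages(args, "pip", "=="):
                findings.append((lineno, "DL3013", pkg))
    return findings


sample = """\
FROM ubuntu:latest
RUN apt-get install -y python
RUN apt-get install -y python=2.7
RUN pip install django
RUN pip install django==1.9
"""
findings = check_pinning(sample)
for finding in findings:
    print(finding)
```

Each finding carries the line number so that it could be attached to the instruction it violates.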
Paper: https://peerj.com/preprints/2905/
Online Appendix (Dataset, Scripts, Analyses, Plots): https://github.com/sealuzh/docker-ecosystem-paper
Jürgen Cito, Gerald Schermann, Erik Wittern, Philipp Leitner, Sali Zumberi, Harald C. Gall
@citostyle @sh3llcat
Empirical Analysis of Ecosystem, Quality and Standards Compliance, Evolution
Tools
> docker-parser: https://github.com/sealuzh/dockerparser
> dockolution: https://github.com/sealuzh/dockolution