Who I Think You Are
Software engineer, sysadmin, etc. who...
• wants to learn about namespaces and cgroups
• is interested in containers and how they work
• loves turtles (optional)
Saturday, September 21, 13
Modern Linux Server with Containers
Overview
• System Designs
• Namespaces
• Cgroups
• Tooling
The Spectrum
• Hypervisor
• Container
• Application Container
WARNING
System Designs
Hypervisor
• Host provides full hardware environment
• Block device, Ethernet device, etc.
• Guests run a full kernel
Container
• Host provides Kernel
• Filesystem, network interface, etc. are already there
• Guest starts from /sbin/init
Application Container
• Host provides Kernel
• User data, socket fd, etc. are already there
• Starts from the application, not init
Namespaces
Imagine: cool medieval castle photo
*perhaps fog rolling in*
Filesystem
• Read-only
• Shared
Private bind mount
before:
after:
source/a-file
bind/a-file
mount -t tmpfs -o size=1M tmpfs source/mnt
before:
after:
source/mnt/tmpfs-file
mount -t tmpfs -o size=1M tmpfs bind/mnt2
before:
after:
bind/mnt2/mnt2-file
Shared bind mount
before:
after:
source/a-file
bind/a-file
mount -t tmpfs -o size=1M tmpfs source/mnt
before:
after:
source/mnt/tmpfs-file
bind/mnt/tmpfs-file
mount -t tmpfs -o size=1M tmpfs bind/mnt2
before:
after:
source/mnt2/mnt2-file
bind/mnt2/mnt2-file
Slave bind mount
before:
after:
source/a-file
bind/a-file
mount -t tmpfs -o size=1M tmpfs source/mnt
before:
after:
source/mnt/tmpfs-file
bind/mnt/tmpfs-file
mount -t tmpfs -o size=1M tmpfs bind/mnt2
before:
after:
bind/mnt2/mnt2-file
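The private, shared and slave variants differ only in the propagation flag set on the mounts. A minimal sketch, assuming util-linux `mount` and the directory names `source` and `bind` used on the slides. The `run` and `bind_with_propagation` helpers are illustrative names, not from the talk. By default the script only prints the commands; set RUN=1 as root on a scratch machine to execute them.

```shell
# Dry run by default: print each command instead of executing it.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# bind_with_propagation MODE SRC DST   (MODE: private | shared | slave)
bind_with_propagation() {
    mode=$1 src=$2 dst=$3
    case $mode in
        private|shared|slave) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Make SRC a mount point of its own so it can carry a propagation flag.
    run mount --bind "$src" "$src"
    if [ "$mode" = private ]; then
        run mount --make-private "$src"
    else
        run mount --make-shared "$src"
    fi
    # For a shared SRC, the new bind mount joins SRC's peer group.
    run mount --bind "$src" "$dst"
    case $mode in
        private) run mount --make-private "$dst" ;;  # no events either way
        slave)   run mount --make-slave "$dst" ;;    # events flow SRC -> DST only
    esac
}

bind_with_propagation slave source bind
```

With the slave variant, a tmpfs mounted later on source/mnt should show up under bind/mnt, while one mounted on bind/mnt2 stays invisible from source, matching the slide.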
Patterns
• Mounting RO /usr inside a container
• Private /tmp per service
• Sharing data across containers via binds
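The read-only /usr pattern can be sketched with two mount calls. This assumes util-linux `mount`; `/srv/container` is a hypothetical container root and `ro_bind` is an illustrative helper name. Commands are only printed unless RUN=1 is set.

```shell
# Dry run by default: print each command instead of executing it.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# ro_bind SRC DST: expose SRC at DST read-only.
ro_bind() {
    run mount --bind "$1" "$2"
    # The read-only flag is applied in a second step by remounting the bind.
    run mount -o remount,bind,ro "$2"
}

ro_bind /usr /srv/container/usr
```

For the private /tmp pattern, systemd exposes the same idea as the `PrivateTmp=` unit option.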
Networking
• Root namespace
• Bridging
• Private namespace with socket activation
Root Namespace
• Full access to the machine interfaces
Root Namespace
• Advantages
  • Fast
  • Easy to set up
  • Network looks normal to the container
• Disadvantages
  • No separation of concerns
  • Container has full control
Network Bridges
• Create a bridge, which acts like a virtual switch
• Create the container namespace and add an interface to it
• Attach the container interface to the bridge
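The three steps above can be sketched with iproute2. All names and the address (`br0`, `web`, `veth-web`, `ceth0`, `10.0.3.2/24`) are assumptions for illustration, as is the `setup_bridge` helper; a veth pair is one common way to give the namespace an interface, not necessarily the one used in the talk. Commands are only printed unless RUN=1 is set.

```shell
# Dry run by default: print each command instead of executing it.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# setup_bridge BRIDGE NETNS HOST_IF CT_IF CT_ADDR
setup_bridge() {
    bridge=$1 netns=$2 host_if=$3 ct_if=$4 ct_addr=$5

    # 1. Create a bridge, which acts like a virtual switch.
    run ip link add "$bridge" type bridge
    run ip link set "$bridge" up

    # 2. Create the container namespace and move one end of a veth pair in.
    run ip netns add "$netns"
    run ip link add "$host_if" type veth peer name "$ct_if"
    run ip link set "$ct_if" netns "$netns"
    run ip netns exec "$netns" ip addr add "$ct_addr" dev "$ct_if"
    run ip netns exec "$netns" ip link set "$ct_if" up

    # 3. Attach the host end of the pair to the bridge.
    run ip link set "$host_if" master "$bridge"
    run ip link set "$host_if" up
}

setup_bridge br0 web veth-web ceth0 10.0.3.2/24
```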
Network Bridges
• Advantages
  • Network looks normal to the container
• Disadvantages
  • More complex to set up
  • Slower
  • NAT to the internet
  • iptables needed to expose a public socket
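The NAT and port-exposure steps might look like the following. This is a sketch assuming containers on `10.0.3.0/24` behind a bridge named `br0`; the subnet, addresses, ports and helper names are all illustrative. Commands are only printed unless RUN=1 is set.

```shell
# Dry run by default: print each command instead of executing it.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# nat_outbound SUBNET BRIDGE: masquerade container traffic leaving the host.
nat_outbound() {
    run sysctl -w net.ipv4.ip_forward=1
    run iptables -t nat -A POSTROUTING -s "$1" ! -o "$2" -j MASQUERADE
}

# expose_port HOST_PORT CT_ADDR CT_PORT: forward a public TCP port to a container.
expose_port() {
    run iptables -t nat -A PREROUTING -p tcp --dport "$1" \
        -j DNAT --to-destination "$2:$3"
}

nat_outbound 10.0.3.0/24 br0
expose_port 8080 10.0.3.2 80
```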
Socket Activation
• No interface
• Sockets are passed via stdin (inetd)
• systemd style listen fd API
inetd style
• Advantages
  • Fast and isolated
  • Simple and well understood
  • Support from existing daemons like sshd
  • No process running until needed
• Disadvantages
  • One process per client (scaling problems!)
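An inetd-style service never touches the network itself: the super-server accepts the connection and starts one copy of the program per client, with the socket as stdin and stdout. A minimal sketch of such a handler; the `handle_client` name and the toy echo protocol are made up for illustration.

```shell
# Read lines from the client (stdin) and answer on the same socket (stdout).
handle_client() {
    cr=$(printf '\r')
    while IFS= read -r line; do
        line=${line%"$cr"}              # tolerate CRLF line endings
        [ "$line" = quit ] && break     # client asked to close
        printf 'echo: %s\n' "$line"
    done
}

# When started by the super-server, the script would end with a bare call:
# handle_client
```

Because nothing here knows about sockets, the same handler can be exercised locally with a pipe, for example `printf 'hi\nquit\n' | handle_client`.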
listen fd style
• Advantages
  • Fast and isolated
  • Only one process needed per service
  • No process running until needed
• Disadvantages
  • Daemons must be patched
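The patch a daemon needs is small: systemd hands over already-listening sockets starting at file descriptor 3 and announces them through the `LISTEN_PID` and `LISTEN_FDS` environment variables. A sketch of that handshake follows; `listen_fds` is an illustrative helper name. A real daemon would go on to call accept() on these descriptors, which plain shell cannot do, so only the discovery step is shown.

```shell
SD_LISTEN_FDS_START=3   # first inherited descriptor, as in sd_listen_fds(3)

# Print the inherited listening descriptors, one per line.
# Prints nothing when the process was not socket activated.
listen_fds() {
    # LISTEN_PID guards against variables leaking in from a parent process.
    [ "${LISTEN_PID:-}" = "$$" ] || return 0
    fd=$SD_LISTEN_FDS_START
    last=$((SD_LISTEN_FDS_START + ${LISTEN_FDS:-0}))
    while [ "$fd" -lt "$last" ]; do
        echo "$fd"
        fd=$((fd + 1))
    done
}

listen_fds
```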
Process Namespace
• PID 1 inside the namespace has a different PID outside the namespace
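One way to see both numbers at once, assuming a kernel new enough (Linux 4.1 or later) to report an `NSpid:` line in /proc/<pid>/status. The `pids_by_namespace` helper is an illustrative name. A line such as `NSpid: 12345 1` would mean the process is 12345 from the outside and 1 inside its own namespace.

```shell
# Print every PID listed on the NSpid line, one per line.
# Reads the text of /proc/<pid>/status on stdin.
pids_by_namespace() {
    awk '$1 == "NSpid:" { for (i = 2; i <= NF; i++) print $i }'
}

# Inspect the current process where /proc is available.
if [ -r /proc/self/status ]; then
    pids_by_namespace < /proc/self/status
fi

# To create a namespace to look at (needs root, util-linux unshare):
#   unshare --pid --fork --mount-proc sh
```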
All the Rest
Cgroups
Imagine: an accountant’s overflowing desk
perhaps hands on head in despair
Block I/O
• Limit: Weight from 10 to 1000
• Limit: Read/write bandwidth limits
• Metrics: IOPS serviced, waiting, and queued
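Both limits are set by writing to files in the group's directory. A sketch assuming the cgroup v1 blkio controller mounted under /sys/fs/cgroup/blkio; the group name `web`, device 8:0 and the helper names are hypothetical. The script only prints the writes it would make.

```shell
CG=/sys/fs/cgroup/blkio/web   # cgroup v1 layout assumed, hypothetical group

# Succeeds only for weights in the allowed range (10..1000).
valid_weight() {
    [ "$1" -ge 10 ] 2>/dev/null && [ "$1" -le 1000 ]
}

# Format a throttle rule: "MAJOR:MINOR BYTES_PER_SECOND".
throttle_rule() {
    printf '%s:%s %s\n' "$1" "$2" "$3"
}

if valid_weight 500; then
    echo "echo 500 > $CG/blkio.weight"
fi
# Cap reads from device 8:0 at 1 MiB/s.
echo "echo '$(throttle_rule 8 0 1048576)' > $CG/blkio.throttle.read_bps_device"
```

The metrics on the slide would then be read from files such as blkio.io_serviced, blkio.io_wait_time and blkio.io_queued in the same directory.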
CPU
• Limit: Relative shares; a group with 1024 shares gets half the CPU time of one with 2048
• Metrics: cpuacct.stat user and system time
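Shares are relative, so what a group actually gets depends on who else is busy. A small worked example of the arithmetic; `cpu_percent` is an illustrative helper, and the result only applies when every listed group is competing for the CPU.

```shell
# cpu_percent OWN OTHER...: integer percentage of CPU time a group with OWN
# shares can expect when all listed groups are busy at once.
cpu_percent() {
    own=$1
    total=0
    for s in "$@"; do
        total=$((total + s))
    done
    echo $((own * 100 / total))
}

cpu_percent 1024 2048   # prints 33
cpu_percent 2048 1024   # prints 66
```

On an otherwise idle machine the 1024-share group can still use all of the CPU; shares only matter under contention.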
Memory
• Limit: Total RSS memory limit
• Metrics: swap, total rss, # page ins/outs
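The memory metrics live in a plain key/value file. A sketch assuming the cgroup v1 memory controller under /sys/fs/cgroup/memory; the group name `web` and the `stat_value` helper are hypothetical.

```shell
STAT=/sys/fs/cgroup/memory/web/memory.stat   # cgroup v1 path, hypothetical group

# stat_value KEY: print the value of KEY from memory.stat text on stdin.
stat_value() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Report the metrics named on the slide, if the group exists.
if [ -r "$STAT" ]; then
    for key in rss swap pgpgin pgpgout; do
        printf '%s=%s\n' "$key" "$(stat_value "$key" < "$STAT")"
    done
fi

# Setting the limit is a single write (needs root):
#   echo 256M > /sys/fs/cgroup/memory/web/memory.limit_in_bytes
```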
Tooling
docker
nspawn
nsenter
/sys/fs/cgroup
systemd units
systemd-cgtop
Recap
• Containers are built on namespaces and cgroups
• Namespaces provide isolation similar to hypervisors
• Cgroups provide resource limiting and accounting
• These tools can be mixed to create hybrids
Future
Thanks!
@BrandonPhilips
@CoreOSLinux