Modern Linux Server with Containers

Slides for my talk "Modern Linux Server with Containers" at LinuxCon 2013. Video: www.youtube.com/watch?v=ZD7HDrtkZoI

Brandon Philips

September 21, 2013

Transcript

  1. Who I Think You Are: Software engineer, sysadmin, etc. who is... • wanting to learn about namespaces and cgroups • interested in containers and how they work • loves turtles (optional) Saturday, September 21, 13

  2. Modern Linux Server with Containers brandon.philips@coreos.com

  3. Overview

  4. Overview • System Designs

  5. Overview • System Designs • Namespaces

  6. Overview • System Designs • Namespaces • Cgroups

  7. Overview • System Designs • Namespaces • Cgroups • Tooling

  8. The Spectrum

  9. (no text on this slide)

  10. Hypervisor

  11. Container • Hypervisor

  12. Container • Application Container • Hypervisor

  13. WARNING

  14. (no text on this slide)

  15. (no text on this slide)

  16. (no text on this slide)

  17. (no text on this slide)

  18. (no text on this slide)

  19. (no text on this slide)

  20. System Designs

  21. (no text on this slide)

  22. Hypervisor

  23. Hypervisor • Host provides full hardware environment

  24. Hypervisor • Host provides full hardware environment • Block device, ethernet device, etc.

  25. Hypervisor • Host provides full hardware environment • Block device, ethernet device, etc. • Guests run a full kernel

  26. (no text on this slide)

  27. Container

  28. Container • Host provides Kernel

  29. Container • Host provides Kernel • Filesystem, network interface, etc. are already there

  30. Container • Host provides Kernel • Filesystem, network interface, etc. are already there • Guest starts from /sbin/init

  31. (no text on this slide)

  32. Application Container

  33. Application Container • Host provides Kernel

  34. Application Container • Host provides Kernel • User data, socket fd, etc. are already there

  35. Application Container • Host provides Kernel • User data, socket fd, etc. are already there • Starts from application, not init

  36. Namespaces

  37. Imagine: cool medieval castle photo *perhaps fog rolling in*

  38. Filesystem

  39. Filesystem • Read-only

  40. Filesystem • Read-only • Shared

  41. Filesystem • Read-only • Shared • Slave

  42. Filesystem • Read-only • Shared • Slave • Private

  43. Read-only

  44. Private bind mount • before: after: source/a-file, bind/a-file • mount -t tmpfs -o size=1M tmpfs source/mnt • before: after: source/mnt/tmpfs-file • mount -t tmpfs -o size=1M tmpfs bind/mnt2 • before: after: bind/mnt2/mnt2-file

  45. Shared bind mount • before: after: source/a-file, bind/a-file • mount -t tmpfs -o size=1M tmpfs source/mnt • before: after: source/mnt/tmpfs-file, bind/mnt/tmpfs-file • mount -t tmpfs -o size=1M tmpfs bind/mnt2 • before: after: source/mnt2/mnt2-file, bind/mnt2/mnt2-file

  46. Slave bind mount • before: after: source/a-file, bind/a-file • mount -t tmpfs -o size=1M tmpfs source/mnt • before: after: source/mnt/tmpfs-file, bind/mnt/tmpfs-file • mount -t tmpfs -o size=1M tmpfs bind/mnt2 • before: after: bind/mnt2/mnt2-file

  47. Patterns • Mounting RO /usr inside a container • Private /tmp per service • Sharing data across containers via binds

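Editor's sketch for the private, shared and slave bind mounts above: one way to see which propagation mode a mount has is to read the optional fields in /proc/self/mountinfo. The kernel tags shared mounts with shared:N and slave mounts with master:N, and a mount with neither tag is private. The sample lines and the function name propagation are illustrative assumptions, not material from the talk.

```python
# Sample lines in /proc/self/mountinfo format (illustrative values only).
SAMPLE_MOUNTINFO = """\
22 1 0:20 / /sys rw,nosuid shared:7 - sysfs sysfs rw
36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
40 22 0:30 / /tmp rw - tmpfs tmpfs rw,size=1024k
"""


def propagation(mountinfo_line):
    """Classify one mountinfo line as shared, slave, private, etc."""
    fields = mountinfo_line.split()
    # Fields 0-5 are fixed; optional tags follow until a lone "-" separator.
    separator = fields.index("-", 6)
    tags = {field.split(":", 1)[0] for field in fields[6:separator]}
    if "unbindable" in tags:
        return "unbindable"
    if "shared" in tags and "master" in tags:
        return "shared+slave"
    if "shared" in tags:
        return "shared"
    if "master" in tags:
        return "slave"
    return "private"


if __name__ == "__main__":
    for line in SAMPLE_MOUNTINFO.splitlines():
        mount_point = line.split()[4]
        print(mount_point, propagation(line))
```

On a live system the same function could be fed the lines of /proc/self/mountinfo instead of the sample text.
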
  48. Networking

  49. Networking • Root namespace

  50. Networking • Root namespace • Bridging

  51. Networking • Root namespace • Bridging • Private namespace with socket activation

  52. Root Namespace • Full access to the machine interfaces

  53. Root Namespace

  54. Root Namespace • Advantages

  55. Root Namespace • Advantages • Fast

  56. Root Namespace • Advantages • Fast • Easy to set up

  57. Root Namespace • Advantages • Fast • Easy to set up • Network looks normal to the container

  58. Root Namespace • Advantages • Fast • Easy to set up • Network looks normal to the container

  59. Root Namespace • Advantages • Fast • Easy to set up • Network looks normal to the container • Disadvantages

  60. Root Namespace • Advantages • Fast • Easy to set up • Network looks normal to the container • Disadvantages • No separation of concerns

  61. Root Namespace • Advantages • Fast • Easy to set up • Network looks normal to the container • Disadvantages • No separation of concerns • Container has full control

  62. Network Bridges

  63. Network Bridges • Create a bridge, like a virtual switch

  64. Network Bridges • Create a bridge, like a virtual switch • Create container namespace and add interface

  65. Network Bridges • Create a bridge, like a virtual switch • Create container namespace and add interface • Attach container interface to bridge

  66. Network Bridges

  67. Network Bridges • Advantages

  68. Network Bridges • Advantages • More complex to set up

  69. Network Bridges • Advantages • More complex to set up • Network looks normal to the container

  70. Network Bridges • Advantages • More complex to set up • Network looks normal to the container

  71. Network Bridges • Advantages • More complex to set up • Network looks normal to the container

  72. Network Bridges • Advantages • More complex to set up • Network looks normal to the container • Disadvantages

  73. Network Bridges • Advantages • More complex to set up • Network looks normal to the container • Disadvantages • Less speed

  74. Network Bridges • Advantages • More complex to set up • Network looks normal to the container • Disadvantages • Less speed • NAT to the internet

  75. Network Bridges • Advantages • More complex to set up • Network looks normal to the container • Disadvantages • Less speed • NAT to the internet • iptables to expose public socket

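Editor's sketch of the three bridge steps listed above (create a bridge, create a container namespace with an interface, attach the interface to the bridge). It only builds the iproute2 command lines and prints them; running them needs root. The names br0, ctr1, veth-host and veth-ctr are hypothetical, and the talk does not say which tool it used for this step.

```python
def bridge_setup_commands(bridge, netns, host_if, ctr_if):
    """Return iproute2 commands that wire a network namespace to a bridge."""
    return [
        # 1. Create the bridge, which acts like a virtual switch.
        ["ip", "link", "add", "name", bridge, "type", "bridge"],
        ["ip", "link", "set", bridge, "up"],
        # 2. Create the container namespace and a veth pair, then move
        #    one end of the pair into the namespace.
        ["ip", "netns", "add", netns],
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ctr_if],
        ["ip", "link", "set", ctr_if, "netns", netns],
        # 3. Attach the host end to the bridge and bring both ends up.
        ["ip", "link", "set", host_if, "master", bridge],
        ["ip", "link", "set", host_if, "up"],
        ["ip", "netns", "exec", netns, "ip", "link", "set", ctr_if, "up"],
    ]


if __name__ == "__main__":
    # Hypothetical names; print the commands instead of executing them.
    for command in bridge_setup_commands("br0", "ctr1", "veth-host", "veth-ctr"):
        print(" ".join(command))
```

Addressing, NAT and the iptables rule for exposing a public socket are left out of this sketch.
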
  76. Socket Activation

  77. Socket Activation • No interface

  78. Socket Activation • No interface • Sockets are passed via stdin (inetd)

  79. Socket Activation • No interface • Sockets are passed via stdin (inetd) • systemd style listen fd API

  80. inetd style

  81. inetd style • Advantages

  82. inetd style • Advantages • Fast and isolated

  83. inetd style • Advantages • Fast and isolated • Simple and well understood

  84. inetd style • Advantages • Fast and isolated • Simple and well understood • Support from existing daemons like ssh

  85. inetd style • Advantages • Fast and isolated • Simple and well understood • Support from existing daemons like ssh • No process running until needed

  86. inetd style • Advantages • Fast and isolated • Simple and well understood • Support from existing daemons like ssh • No process running until needed • Disadvantages

  87. inetd style • Advantages • Fast and isolated • Simple and well understood • Support from existing daemons like ssh • No process running until needed • Disadvantages • One process per client (scaling problems!)

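Editor's sketch of an inetd-style service: the supervisor accepts the connection and starts one process per client with the socket as stdin and stdout, so the service needs no network interface of its own. The upper-casing echo behaviour and the name serve are assumptions made for illustration.

```python
import io


def serve(reader, writer):
    """Echo each line back in upper case until the client closes."""
    for line in reader:
        writer.write(line.upper())
        writer.flush()


if __name__ == "__main__":
    # Demo with in-memory streams. Under inetd (or a systemd socket unit
    # with Accept=yes) the call would be serve(sys.stdin, sys.stdout).
    demo_in = io.StringIO("hello\n")
    demo_out = io.StringIO()
    serve(demo_in, demo_out)
    print(demo_out.getvalue(), end="")
```

Because every client gets its own process, this is also where the scaling problem noted on the slide comes from.
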
  88. listen fd style

  89. listen fd style • Advantages

  90. listen fd style • Advantages • Fast and isolated

  91. listen fd style • Advantages • Fast and isolated • Only one process needed per service

  92. listen fd style • Advantages • Fast and isolated • Only one process needed per service • No process running until needed

  93. listen fd style • Advantages • Fast and isolated • Only one process needed per service • No process running until needed

  94. listen fd style • Advantages • Fast and isolated • Only one process needed per service • No process running until needed • Disadvantages

  95. listen fd style • Advantages • Fast and isolated • Only one process needed per service • No process running until needed • Disadvantages • Patches required to daemons

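Editor's sketch of the daemon-side patch that the listen fd style needs. With systemd socket activation the listening sockets arrive as inherited file descriptors starting at fd 3, and the environment variables LISTEN_PID and LISTEN_FDS say who they are for and how many there are. The helper name listen_fds is hypothetical; it mirrors the checks of sd_listen_fds() rather than calling it.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited fd, as defined by systemd


def listen_fds(environ, pid):
    """Return the file descriptors passed by socket activation, if any."""
    # The fds are only meant for the process named in LISTEN_PID.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    if count < 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


if __name__ == "__main__":
    fds = listen_fds(os.environ, os.getpid())
    # Each fd could then be wrapped with socket.socket(fileno=fd) and
    # passed to accept(); started by hand this prints an empty list.
    print(fds)
```

A single long-running process can then accept every client on the inherited socket, which is the "only one process needed per service" advantage.
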
  96. Process Namespace • PID 1 is something else outside the namespace

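Editor's sketch of the PID 1 point above: the same process has a different PID in each PID namespace it belongs to. On kernels newer than the ones available at the time of this talk (Linux 4.1 and later), /proc/PID/status has an NSpid line listing those PIDs from the outermost namespace to the innermost. The sample text and the function name are assumptions for illustration.

```python
# Illustrative excerpt of /proc/<pid>/status for a container's init process.
SAMPLE_STATUS = "Name:\tinit\nPid:\t4321\nNSpid:\t4321\t1\n"


def pids_by_namespace(status_text):
    """Return the process's PID in each nested PID namespace, outermost first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(pid) for pid in line.split()[1:]]
    return []  # no NSpid line (older kernel)


if __name__ == "__main__":
    # The process is PID 4321 on the host and PID 1 inside the container.
    print(pids_by_namespace(SAMPLE_STATUS))
```
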
  97. All the Rest

  98. Cgroups

  99. Imagine: an accountant’s overflowing desk, perhaps hands on head in despair

  100. Block I/O • Limit: Weight from 10 to 1000 • Limit: Bandwidth limits R/W • Metrics: iops serviced, waiting and queued

  101. CPU • Limit: Shares system, where 1024 is half of 2048 • Metrics: cpuacct.stat user and system

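Editor's sketch of the shares arithmetic on the CPU slide: shares are relative weights, so a group's portion of CPU time under contention is its shares divided by the sum of the shares of all competing groups. The group names web and batch are hypothetical.

```python
from fractions import Fraction


def cpu_fractions(shares):
    """Map each group to its fraction of CPU time when all groups are busy."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one group needs a positive share")
    return {name: Fraction(value, total) for name, value in shares.items()}


if __name__ == "__main__":
    # 1024 is half of 2048, so "web" gets half as much CPU as "batch".
    for name, fraction in cpu_fractions({"web": 1024, "batch": 2048}).items():
        print(name, fraction)  # web 1/3, then batch 2/3
```

Shares only matter when the CPU is contended; an otherwise idle machine lets a single group use all of it.
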
  102. Memory • Limit: Total RSS memory limit • Metrics: swap, total rss, # page ins/outs

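Editor's sketch of reading the memory metrics named above. It assumes the cgroup v1 memory controller, whose memory.stat file holds one "name value" pair per line, including rss, swap, pgpgin and pgpgout. The sample numbers are made up, and swap only appears when swap accounting is enabled.

```python
# Illustrative memory.stat contents (made-up numbers).
SAMPLE_MEMORY_STAT = """\
cache 1048576
rss 4194304
pgpgin 2048
pgpgout 1024
swap 0
"""

METRICS = ("rss", "swap", "pgpgin", "pgpgout")


def parse_memory_stat(text):
    """Parse 'name value' lines into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        name, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[name] = int(value)
    return stats


def summary(stats):
    """Pick the metrics from the slide; missing ones are reported as None."""
    return {name: stats.get(name) for name in METRICS}


if __name__ == "__main__":
    print(summary(parse_memory_stat(SAMPLE_MEMORY_STAT)))
```
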
  103. Tooling

  104. docker

  105. nspawn

  106. nsenter

  107. /sys/fs/cgroup

  108. systemd units

  109. systemd-cgtop

  110. Recap

  111. Recap • Containers are built on namespaces and cgroups

  112. Recap • Containers are built on namespaces and cgroups • Namespaces provide isolation similar to hypervisors

  113. Recap • Containers are built on namespaces and cgroups • Namespaces provide isolation similar to hypervisors • Cgroups provide resource limiting and accounting

  114. Recap • Containers are built on namespaces and cgroups • Namespaces provide isolation similar to hypervisors • Cgroups provide resource limiting and accounting • These tools can be mixed to create hybrids

  115. Future

  116. Thanks! @BrandonPhilips @CoreOSLinux