Software Circus Meetup: How to replace your engines mid-flight while maintaining a steady cruising altitude, or: how I learned to stop worrying and love Flatcar Container Linux

Installing Linux on a new PC isn’t that hard, right? How about replacing the operating system on a fleet of thousands of virtual machines spread across multiple clouds and on-premises installations, with major global enterprise customers relying on you to keep their infrastructure running 24x7? This was the challenge Giant Swarm faced when they realized that CoreOS Container Linux, on which their managed Kubernetes service was based, had been declared end of life, and decided to migrate to Flatcar Container Linux with the help of Kinvolk. In this talk, we will explain the process Giant Swarm followed to decide on their migration strategy, the challenges of implementing it, and the lessons learned along the way.

Andy Randall

October 21, 2020

Transcript

  1. How to replace your engines mid-flight while maintaining a steady cruising altitude, or: how I learned to stop worrying and love Flatcar Container Linux. Software Circus Meetup, 21 October 2020. @andrew_randall @kinvolkio @salvo_mazzy @giantswarm
  2. Andy Randall, business development @ Kinvolk (@andrew_randall, @kinvolkio), Berlin
    Salvatore Mazzarino, site reliability engineer @ Giant Swarm (@salvo_mazzy, @giantswarm), Prague
  3. Who is Kinvolk
    - Berlin HQ with globally distributed team
    - Community: All Systems Go!, Cloud-Native Rejekts, o4b, community days, …
    - Independent: founded 2015, no external investors
    - Open source: Linux, Kubernetes, OSS consulting
  4. “Secure the Internet” (2013)

  5. Why use a Container Linux?
    - Minimal distribution required for containers: reduced dependencies, less software to manage, reduced attack surface area, repeatable deployment without per-host scripting
    - Immutable file system: operational simplicity for management at scale, removes entire category of security threats, e.g. runc vulnerability CVE-2019-5736*
    - Automated, streamlined updates: operational simplicity for management at scale, easily apply all latest security patches, rollback partition, co-ordinated with k8s control plane (update operator)
    * See kinvolk.io/blog/2019/02/runc-breakout-vulnerability-mitigated-on-flatcar-linux
  6. Gentoo → ChromeOS → CoreOS Container Linux (minimal set of packages, update mechanism)
  7. None

  8. None
  9. Gentoo → ChromeOS → CoreOS Container Linux → Flatcar Container Linux (minimal set of packages, update mechanism)
  10. flatcar /ˈflatkɑː/ noun: a railway freight wagon without a roof or sides, often used to transport shipping containers as part of intermodal freight shipping
  11. Taking Container Linux Forward
    - Continuous updates: kernel → 5.4 (stable), 5.8 (alpha); 25 releases since CoreOS EOL; 56 component packages updated
    - Focus on security: 145 security vulnerabilities (CVEs) fixed; joined Linux Kernel Security Team
    - Open source update server: created and open sourced update server (previously proprietary CoreOS offering)
    - Ambitious roadmap: telemetry services, broader platform support, security/regulatory certifications, …
    - NEW: Flatcar Pro for cloud: optimized for Microsoft Azure (initially) with Azure-tuned kernel; includes enterprise support
  12. Embraced by the community
    - Supported wherever you deploy containers
    - Fast-growing installed base
    - Trusted by leading global enterprises
  13. “Flatcar Container Linux imho is the best distro for k8s clusters atm” – Darren Shepherd, Co-founder / CTO, Rancher Labs
  14. What does Giant Swarm do?
    - 200+ clusters created & lifecycle managed by us
    - AWS / Azure / KVM (+ AWS China)
    - Many large clusters
    - Production ready (24/7 monitoring, day 2 ops, …)
  15. Glossary
    - Control Plane: Kubernetes cluster running Giant Swarm services
    - Provider: Third-party service (public cloud or on-premises) providing low-level primitives such as Virtualization, Networking, Data Storage
    - Tenant Cluster: On-demand Kubernetes cluster created by Giant Swarm’s customers running customers’ workloads
  16. Time to catch a new train
    “Immutable infrastructure is an important part of making a container platform scalable and reliable. So having a really small OS, was and it is still important.” – Timo Derstappen, CTO @Giant Swarm, https://bit.ly/3lUnlSi
  17. Providers
    - Amazon Web Services (CoreOS Container Linux): 7 regions, 150+ clusters, 1700+ VMs
    - Microsoft Azure (CoreOS Container Linux): 2 regions, 30+ clusters, 200+ VMs
    - KVM (CoreOS Container Linux): 4 Enterprise Data Centers, 12+ clusters, 300+ VMs
  18. Platforms

  19. Amazon Web Services
    - AWS EC2
    - AMIs provided by Kinvolk
    - AWS China
    - Smooth and Easy *
    * https://bit.ly/3kgifPN
  20. Microsoft Azure
    - Azure VM Scale Set
    - Publisher changed
    - Automation powers
    - More steps involved
  21. KVM
    - PXE Boot for Control Planes
    - QEMU Images for Tenant Clusters
    - Kernel params changes
  22. Ignition
    { "ignition": { "version": "2.2.0" }, "storage": { "disks": [ ... ], "filesystems": [ ... ] }, "systemd": { "units": [ ... ] } }
  23. Customers
    - Same OS
    - Security
    - Release process

  24. Today

  25. Today

  26. Sponsorship

  27. What’s next?
    - Releases Conformance tests
    - New Hypervisors testing (Firecracker)
    - Virt improvements (kernel 5.4+)
    - Built-in WireGuard support
  28. Questions?
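
Slide 22 shows only the skeleton of an Ignition config (spec version 2.2.0) with its lists elided. As a rough sketch of how such a document might be assembled programmatically, the Python below fills that skeleton with one systemd unit. The function name `build_ignition_config` and the `hello.service` unit are hypothetical and are not taken from the talk; the disks and filesystems lists are left empty because the slide does not show their contents.

```python
import json


def build_ignition_config(units):
    """Return a dict following the Ignition v2.2.0 layout shown on slide 22.

    `units` is a list of systemd unit dicts (name / enabled / contents).
    Disks and filesystems stay empty: the slide elides them.
    """
    return {
        "ignition": {"version": "2.2.0"},
        "storage": {"disks": [], "filesystems": []},
        "systemd": {"units": list(units)},
    }


# Hypothetical unit, for illustration only (not from the talk).
example_unit = {
    "name": "hello.service",
    "enabled": True,
    "contents": (
        "[Unit]\n"
        "Description=Example unit\n\n"
        "[Service]\n"
        "Type=oneshot\n"
        "ExecStart=/usr/bin/echo hello\n\n"
        "[Install]\n"
        "WantedBy=multi-user.target\n"
    ),
}

config = build_ignition_config([example_unit])
rendered = json.dumps(config, indent=2)
print(rendered)
```

Building the config as a plain dict and serializing it once keeps the same template reusable across providers, which fits the "repeatable deployment without per-host scripting" point on slide 5. A real config would also need to be validated against the Ignition specification before use.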