A story of migration from Docker Swarm to Kubernetes

By Shigeyuki Fujishima (LINE Fukuoka)
At CloudNative Days Fukuoka 2019 #CND2019
https://cloudnativedays.jp/cndf2019/

LINE Developers

April 16, 2019

Transcript

  1. A story of migration from Docker Swarm to Kubernetes
    Shigeyuki Fujishima, LINE Fukuoka
    CloudNative Days Fukuoka 2019
  2. Introduction - I am
    • LINE 2015.04 - 2017.08
    • LINE Fukuoka 2017.09 - Present
    • Experienced
      ◦ Developed/operated an in-house deployment/monitoring system
      ◦ DevOps/SRE-ish role
    • Current
      ◦ A member of the private cloud "Verda" developers in LINE group
  3. Introduction - “Verda” is
    • Name of the private cloud service in LINE
    • Helpful services for LINE developers
    • Many kinds of as-a-service
      ◦ EC2-like VM / Bare metal service
      ◦ Object Storage/CDN
      ◦ Databases (MySQL, Elasticsearch, Redis)
      ◦ Load balancer
      ◦ Kubernetes
      ◦ Heroku-ish service
      ◦ Functions a.k.a. Serverless
      ◦ And more
  4. Project Overview - Past in the day
    • From developer’s view
      ◦ Simple load balancer architecture
        ▪ Developers had “toils” to control the configurations.
          • e.g., URL routing
          • Updating server certs
          • Their own layer 7 for themselves
  5. Project Overview - Past in the day
    • From layer 4 LB’s view
      ◦ Huge number of TCP connections
      ◦ Shared resource
      ◦ Active-Standby
      ◦ Hardware load balancer
  6. Project Overview - Past in the day
    • From layer 4 LB’s view
      ◦ Huge number of TCP connections
  7. Project Overview - Past in the day
    • From L4LB’s view
      ◦ Huge number of TCP connections
        ▪ Stateful session
        ▪ Session table shortage
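A note on the "session table shortage" point: a stateful L4 load balancer keeps one entry per tracked connection, so a fixed-size table eventually refuses new flows. The following is only a minimal sketch of that failure mode, not the hardware load balancer described in the talk. The class name, the capacity, and the addresses are all hypothetical.

```python
from typing import Dict, Tuple

# (client_ip, client_port, vip, vip_port); a hypothetical flow key.
FlowKey = Tuple[str, int, str, int]


class SessionTable:
    """Fixed-capacity table mapping each tracked connection to a backend."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: Dict[FlowKey, str] = {}

    def open(self, flow: FlowKey, backend: str) -> bool:
        """Track a new connection; return False when the table is full."""
        if flow in self.entries:
            return True  # already tracked, no new entry needed
        if len(self.entries) >= self.capacity:
            return False  # session table shortage: the new flow is refused
        self.entries[flow] = backend
        return True

    def close(self, flow: FlowKey) -> None:
        """Release the entry when the connection ends."""
        self.entries.pop(flow, None)


def demo() -> list:
    """Open three flows against a table that only has room for two."""
    table = SessionTable(capacity=2)
    flows = [(f"10.0.0.{i}", 40000, "192.0.2.10", 443) for i in range(1, 4)]
    return [table.open(flow, "backend-a") for flow in flows]
```

In this sketch the third flow is refused until an earlier one is closed, which is why long-lived connections put pressure on a shared, stateful device.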
  8. Project Overview - Past in the day
    • From L4LB’s view
      ◦ Shared resource
        ▪ Noisy neighbor
        ▪ Cascading failure
  9. Project Overview - Past in the day
    • From L4LB’s view
      ◦ Active-Standby
        ▪ 2N: Always double
  10. Project Overview - Past in the day
    • From L4LB’s view
      ◦ Hardware load balancer
        ▪ Getting outdated
  11. Project Overview - Problems to be solved
    • Scalability
      ◦ Easy to scale out / in
    • Flexibility
      ◦ Easy to update/upgrade
    • Isolation
      ◦ Limit a failure domain
  12. Project Story - Phase 1. Docker Swarm - Overview
    • Started in 2016
    • Software-based load balancer
    • Containerized by Docker
      ◦ Linux namespace / cgroup
    • Orchestration by Docker Swarm
      ◦ Standalone mode (not swarm mode)
      ◦ Low cost of learning and development
    • Packet processing by XDP on L4
      ◦ "Assorted notes on packet processing in software: why we ended up building our own load balancer" (ソフトウェアでのパケット処理あれこれ〜何故我々はロードバランサを自作するに至ったのか〜)
  13. Project Story - Phase 1. Docker Swarm - Result
    • Docker / Docker Swarm give us
      ◦ Scalability
        ▪ Container technology
      ◦ Flexibility
        ▪ Good APIs
      ◦ Isolation
        ▪ Linux namespace and cgroup
    ...Everything goes well?
  14. Project Story - Phase 1. New problems
    • Docker Swarm
      ◦ Auto-scalability of containers
        ▪ No support out of the box
      ◦ Docker integrated with Kubernetes
        ▪ A Docker Captain said “Swarm is alive and well.” but…
    • Implementation
      ◦ 1 VIP in 1 container
      ◦ Resource efficiency issue again
  15. Project Story - Problems to be solved in next phase
    • Enable auto-scaling
    • Migration from Docker Swarm to Kubernetes
    • Better accommodations
      ◦ Put many more VIPs on a single server
  16. Project Story - Phase 2. Kubernetes - #1
    • Put as many VIPs as possible in a single container
      ◦ Configure them like virtual hosts
    • Noisy neighbor?
      ◦ Deploy many pods on low-cost machines
      ◦ Incoming traffic is supposed to be balanced
    (Figure: Before / After)
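A note on the "virtual host" approach: instead of one container per VIP, a single process can hold a table keyed by VIP and choose the backend pool from the destination address. The sketch below only illustrates that idea and is not LINE's implementation. `MultiVipBalancer`, the round-robin policy, and all addresses are assumptions made for the example.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class MultiVipBalancer:
    """One process serving many VIPs, similar to virtual hosts on a web server."""

    def __init__(self) -> None:
        # VIP -> endless round-robin iterator over that VIP's backends.
        self._pools: Dict[str, Iterator[str]] = {}

    def add_vip(self, vip: str, backends: List[str]) -> None:
        """Register a VIP together with its own backend pool."""
        if not backends:
            raise ValueError(f"VIP {vip} needs at least one backend")
        self._pools[vip] = cycle(list(backends))

    def pick_backend(self, vip: str) -> str:
        """Select the next backend for traffic addressed to the given VIP."""
        if vip not in self._pools:
            raise KeyError(f"unknown VIP: {vip}")
        return next(self._pools[vip])

    def vip_count(self) -> int:
        """Number of VIPs hosted by this single process."""
        return len(self._pools)


def demo() -> list:
    """Host two VIPs in one balancer and route a few requests."""
    lb = MultiVipBalancer()
    lb.add_vip("192.0.2.10", ["10.0.1.1", "10.0.1.2"])
    lb.add_vip("192.0.2.20", ["10.0.2.1"])
    return [
        lb.pick_backend("192.0.2.10"),
        lb.pick_backend("192.0.2.10"),
        lb.pick_backend("192.0.2.10"),
        lb.pick_backend("192.0.2.20"),
    ]
```

Because each VIP keeps its own pool, adding a VIP costs one table entry rather than one more container, which is the resource-efficiency gain the slide points at.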
  17. Project Story - Phase 2. Kubernetes - #2
    • Auto-scale
      ◦ In progress...
    • Difficulties
      ◦ Handling unpredicted situations
      ◦ Lightning-fast scaling
      ◦ “Graceful” shutdown
        ▪ Graceful…?
          • Keep connections even when the Pod is going down
          • Share communication resources such as sockets…?
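A note on "graceful" shutdown: one common reading is to stop accepting new connections and then wait, up to a deadline, for in-flight connections to finish. Kubernetes sends SIGTERM to a Pod's containers and sends SIGKILL only after the termination grace period, so a drain like this would typically start on SIGTERM. The sketch below assumes that reading. It uses counters instead of real sockets, `DrainingServer` is a hypothetical name, and it does not cover the open question on the slide about sharing sockets between Pods.

```python
import threading
import time


class DrainingServer:
    """Tracks in-flight connections and drains them on shutdown."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self._accepting = True

    def try_accept(self) -> bool:
        """Admit a new connection unless shutdown has begun."""
        with self._lock:
            if not self._accepting:
                return False
            self._active += 1
            return True

    def finish(self) -> None:
        """Mark one in-flight connection as completed."""
        with self._lock:
            self._active -= 1

    def shutdown(self, timeout: float, poll: float = 0.01) -> bool:
        """Stop accepting, then wait for active connections to reach zero.

        Returns True if everything drained before the deadline,
        False if connections were still open when the timeout expired.
        """
        with self._lock:
            self._accepting = False
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self._lock:
                if self._active == 0:
                    return True
            time.sleep(poll)
        with self._lock:
            return self._active == 0
```

The timeout stands in for the grace period: a connection that outlives it is cut regardless, which is why long-lived connections make "graceful" hard to guarantee.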
  18. Project Story - Phase 2. Kubernetes - #3
    • Re-configure network
    • NAPT-less networking using Calico
  19. Project Story - Phase 2. Kubernetes - Calico
    • One of the official CNI plugins
      ◦ Used by
        ▪ Yahoo! Japan
        ▪ Google Cloud Platform
    • “Pure” L3 network
      ◦ No overlay
      ◦ BGP-based IP routing
    • Direct reachability to Pods from outside the cluster
  20. Project Story - Phase 2. Kubernetes - Present
    • (Solved) Better accommodations
      ◦ Virtual Host approach
    • (In progress) Enable auto-scaling
      ◦ Need more time...
    • (Solved) Migration from Docker Swarm to Kubernetes
      ◦ Direct connectivity to Pods by Calico
    • New challenges
      ◦ Intelligent resource scheduling
      ◦ Better communication between L4 and L7
      ◦ Graceful upstream draining
      ◦ and more...
  21. Wrap Up
    • From Legacy to Container
      ◦ Scalability
      ◦ Flexibility
      ◦ Isolation
    • From Docker Swarm to Kubernetes
      ◦ Cloud-native networking by Calico
      ◦ Possibility of advanced features
        ▪ Auto-scaling
        ▪ Intelligent resource scheduling
  22. Side Story: How we implement L7 networking on k8s
    • Pods are not visible from outside the cluster
      ◦ Because nobody outside knows the Pod IP addresses.
  23. Side Story: How we implement L7 networking on k8s
    • Calico can advertise the Pod IP addresses with BGP.
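A note on why BGP advertisement makes Pods reachable: once a router outside the cluster learns which node is the next hop for a Pod prefix, it can forward packets to Pod IPs directly, without NAPT. The sketch below models only the router's view with a longest-prefix-match lookup. It does not speak BGP and does not use Calico's API. The prefixes, node addresses, and `RouteTable` class are hypothetical.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str]


class RouteTable:
    """Routes an upstream router might hold after nodes advertise Pod prefixes."""

    def __init__(self) -> None:
        self.routes: List[Route] = []

    def advertise(self, prefix: str, next_hop: str) -> None:
        """Record that `prefix` is reachable via `next_hop` (a node address)."""
        self.routes.append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, pod_ip: str) -> Optional[str]:
        """Return the next hop for a Pod IP using longest-prefix match."""
        addr = ipaddress.ip_address(pod_ip)
        matches = [(net, hop) for net, hop in self.routes if addr in net]
        if not matches:
            return None  # nobody advertised this address: the Pod is invisible
        return max(matches, key=lambda m: m[0].prefixlen)[1]


def demo() -> list:
    """Two nodes advertise their Pod blocks; look up known and unknown Pod IPs."""
    table = RouteTable()
    table.advertise("10.244.1.0/26", "192.0.2.11")  # hypothetical node A
    table.advertise("10.244.2.0/26", "192.0.2.12")  # hypothetical node B
    return [
        table.lookup("10.244.1.5"),
        table.lookup("10.244.2.7"),
        table.lookup("10.244.9.9"),
    ]
```

The lookup that returns `None` corresponds to the previous slide: without an advertisement, nothing outside the cluster knows where a Pod IP lives.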
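  24. Appendix - Publication
    • LINE's Infrastructure Platform: How It Scales Massive Services and Maintains Low Operational Cost
    • "Developing a homemade load balancer" (自作ロードバランサー開発)
    • "Assorted notes on packet processing in software: why we ended up building our own load balancer" (ソフトウェアでのパケット処理あれこれ〜何故我々はロードバランサを自作するに至ったのか〜)
    • "The present and future of the CaaS platform supporting LINE engineers" (LINE Engineerを支える CaaS基盤の今とこれから)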