Moving from Mesos to Kubernetes without anyone noticing

Anubhav Mishra
December 10, 2017

At Hootsuite, we’ve been using Mesos and Marathon as our microservices platform for over two years, but last year we made the decision to bet on Kubernetes as its replacement. Eight months later, a small team of three operations engineers had migrated our first microservice from Mesos to Kubernetes, all without developers making any code changes. This was possible because we had architected our applications with the proper set of abstractions. Three months after that, we have almost 20 microservices running on Kubernetes in production.

In this session, we’ll do a live demo of migrating a service from Mesos to Kubernetes, just as we did it at Hootsuite! We will cover why architecting your infrastructure with the “right” abstractions helps you do these huge migrations with ease, and how Kubernetes already contains these abstractions. We will explore how having a service mesh helps route traffic between two platforms during the migration, and how a mature CI/CD pipeline can help you deploy to two platforms with ease. To conclude, we will explore the differences between running a service in Mesos and in Kubernetes.

Transcript

  1. Moving from Mesos to Kubernetes without anyone noticing* Anubhav Mishra

  2. Anubhav Mishra @anubhavm

  3. Anubhav Mishra @anubhavm

  4. None
  5. Anubhav Mishra @anubhavm

  6. Anubhav Mishra @anubhavm Atlantis

  7. Anubhav Mishra @anubhavm Atlantis

  8. Anubhav Mishra @anubhavm Atlantis

  9. Anubhav Mishra @anubhavm Atlantis

  10. vs

  11. vs

  12. Agenda • Hootsuite’s Journey from Mesos to Kubernetes • Microservices pipeline ◦ Mesos and Marathon ◦ Kubernetes • Migration without major disruption • Live demo! • Lessons learned/Conclusion
  13. Hootsuite Now

  14. Numbers • 120+ developers • 60+ microservices • 2 cluster schedulers • 1500+ servers on AWS
  15. 2014

  16. None
  17. I want to build a microservice

  18. I want to build a microservice

  19. I want to build a microservice Oh! A “microservice”? Hmm.. seems to be the new thing huh. Yep, just create a JIRA ticket.
  20. Minutes later….

  21. None
  22. None
  23. None
  24. Weeks later….

  25. Here are your servers! Well, that took a while!

  26. Ok! Now I only need Java, Sensu checks and a Jenkins pipeline to deploy to the servers
  27. None
  28. 2016-2017

  29. None
  30. I want to build a microservice

  31. 5 minutes later….

  32. I just deployed a microservice to production!

  33. Microservice Pipeline ./project-generator Pipeline as Code Mesos Marathon

  34. Project Skeleton

  35. = Project Skeleton ./project-generator

  36. Pipeline as Code

  37. Pipeline as Code

  38. None
  39. Packaging

  40. Deployment Files replicas: 1 resources: cpu: 2 memory: 200M healthChecks: ...
  41. Makefile • make deploy-dev • make deploy-staging • make deploy-production

  42. Mesos Marathon

  43. API make deploy

  44. API make deploy POST { "id":"service-1", "cpus": 0.1, "mem": 10.0, "instances": 1 }
  45. API make deploy POST { "id":"service-1", "cpus": 0.1, "mem": 10.0, "instances": 1 } service-1
  46. Routing service-1 10.0.10.1 service-2 10.0.10.2

  47. Routing service-1 10.0.10.1 service-2 10.0.10.2 ?

  48. Routing - Fat Middleware service-1 10.0.10.1 service-2 10.0.10.2

  49. curl http://localhost:5040/service/service-1/endpoint { upstream service-1 { server 10.0.10.1:5041; .... } } localhost:5040 service-2 10.0.10.2
  50. { upstream service-1 { server 10.0.10.1:5041; .... } } service-2 10.0.10.2 curl https://10.0.10.1:5041/service/service-1/endpoint service-1 10.0.10.1
  51. { upstream service-1 { server 10.0.10.1:5041; .... } } service-2 10.0.10.2 service-1 10.0.10.1 localhost:8080
  52. None
  53. None
  54. None
  55. None
  56. Why Kubernetes?

  57. 4 months x

  58. None
  59. ?

  60. Microservices on Mesos and Marathon • Project Skeleton ◦ Golang or Scala • Pipeline as Code ◦ Jenkinsfile ◦ Makefile • Docker images for packaging • API on top of Marathon • Dynamic service discovery • Fat middleware using Consul and NGINX
  61. Microservices on Kubernetes • Project Skeleton ◦ Golang or Scala • Pipeline as Code ◦ Jenkinsfile ◦ Makefile • Docker images for packaging • API on top of Marathon • Dynamic service discovery • Fat middleware using Consul and NGINX • Documentation for getting started • ./mesos2k8s
  62. ./mesos2k8s

  63. Deployment Files • make deploy-k8s-dev • make deploy-k8s-staging • make deploy-k8s-production
  64. Pipeline as Code

  65. Pipeline as Code

  66. Pipeline as Code

  67. Packaging

  68. Packaging

  69. Routing in Mesos service-1 10.0.10.1 service-2 10.0.10.2

  70. Routing in Mesos service-2 10.0.10.2

  71. Routing to K8s service-2 10.0.10.2 service-1 10.0.17.1 172.10.0.10 10.0.30.10

  72. service-2 curl http://localhost:5040/service/service-1/endpoint { upstream service-1 { } upstream bridge-1 { server 10.0.20.1:5041; .... } } localhost:5040
  73. service-2 { upstream service-1 { } upstream bridge-1 { server 10.0.20.1:5041; .... } } bridge-1 (multi-dc aware)
  74. service-2 curl http://localhost:5040/service/service-1/endpoint { upstream service-1 { } upstream bridge-1 { server 10.0.20.1:5041; .... } } curl https://10.0.20.1:5041/service/service-1/endpoint bridge-1 (multi-dc aware)
  75. service-2 curl http://localhost:5040/service/service-1/endpoint { upstream service-1 { } upstream bridge-1 { server 10.0.20.1:5041; .... } } curl https://10.0.30.1:5041/service/service-1/endpoint bridge-1 (multi-dc aware)
  76. service-2 10.0.10.2 service-1 10.0.17.1 172.10.0.10 10.0.30.10 curl https://10.0.30.1:5041/service/service-1/endpoint

  77. service-2 10.0.10.2 service-1 10.0.17.1 172.10.0.10 10.0.30.10 curl https://10.0.30.1:5041/service/service-1/endpoint http://service-1.default.svc.cluster.local:8080

  78. service-2 10.0.10.2 service-1 10.0.17.1 172.10.0.10 10.0.30.10 curl https://10.0.30.1:5041/service/service-1/endpoint

  79. service-2 10.0.10.2 service-1 10.0.17.1 172.10.0.10 10.0.30.10 curl https://10.0.30.1:5041/service/service-1/endpoint Love getting those OK responses!
  80. Rollback service-1 10.0.10.1 service-2 { upstream service-1 { server 10.0.10.1:5041; } upstream bridge-1 { server 10.0.20.1:5041; .... }
  81. foo 10.0.10.4 service-1 172.10.0.10 10.0.30.10

  82. foo 10.0.10.4 foo 10.0.17.100 service-1 172.10.0.10 10.0.30.10 apiVersion: v1 kind: Service metadata: name: foo labels: app: foo spec: ports: - port: 5040 protocol: TCP name: http selector: app: nginx-skyline-router
  83. foo 10.0.10.4 foo 10.0.17.100 service-1 172.10.0.10 10.0.30.10 curl http://foo:5040/endpoint

  84. foo 10.0.10.4 foo 10.0.17.100 service-1 172.10.0.10 10.0.30.10 curl http://foo:5040/endpoint

  85. foo 10.0.10.4 foo 10.0.17.100 service-1 172.10.0.10 10.0.30.10 curl http://foo:5040/endpoint

  86. foo 10.0.10.4 foo 10.0.17.100 service-1 172.10.0.10 10.0.30.10 # match on: # service.namespace.svc.cluster.local # service.namespace # service server_name REGEX …. location / { rewrite ^/(.*)$ /service/$service/$1 break; proxy_pass https://egress_bridge; }
  87. foo 10.0.10.4 foo 10.0.17.100 service-1 172.10.0.10 10.0.30.10 curl http://bridge1:5041/service/foo/endpoint bridge-1 (multi-dc aware)
  88. foo 10.0.10.4 foo 10.0.17.100 service-1 172.10.0.10 10.0.30.10 bridge-1 (multi-dc aware) curl http://10.0.10.4:5041/service/foo/endpoint
  89. Project Skeleton

  90. Ship it! ./project-generator

  91. Microservice Pipeline ./project-generator Pipeline as Code Kubernetes

  92. Documentation

  93. Live Demo

  94. None
  95. Migration Results • Moved 20 services • Time: 1 ½ months • People: 3
  96. None
  97. Things fail, let’s talk about it… • “The bad config outage” • “The classic security group fail”
  98. Lessons Learned/Conclusion

  99. Choose the least important service

  100. Have a rollback plan

  101. Write down what your deployment pipeline looks like

  102. Documentation should be written for humans to read

  103. Pragmatic

  104. Minimizing disruption = Great Adoption

  105. • Migrating Container Schedulers: http://code.hootsuite.com/migrating-container-orchestrators-mesos-kubernetes-nomad/ • Abstracting Marathon Deployment Details from Microservices: http://code.hootsuite.com/abstracting-marathon-deployment-details-from-microservices/ • Consul: https://www.consul.io/ Links
  106. @anubhavm Anubhav Mishra Thank you!

  107. Developer Advocate - HashiCorp @anubhavm Anubhav Mishra I am joining https://medium.com/@anubhavmishra/i-am-joining-hashicorp-5a38e0977867
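
Slides 44 and 45 show `make deploy` posting a small JSON document (`id`, `cpus`, `mem`, `instances`) to an API. Those fields match a Marathon app definition, so the request can be sketched roughly as below. This is a minimal sketch, not Hootsuite's tooling: the deck describes an in-house API on top of Marathon whose URL is not shown, so the host name is made up and the `/v2/apps` path is Marathon's own REST endpoint, used here as a stand-in. The function name `build_deploy_request` is hypothetical, and the request is built but never sent.

```python
import json
from urllib.request import Request


def build_deploy_request(base_url, app_id, cpus, mem, instances):
    """Build (without sending) a POST carrying a Marathon-style app definition.

    Assumption: the target accepts app definitions at /v2/apps, as Marathon
    itself does. The API used in the talk may expose a different path.
    """
    payload = {"id": app_id, "cpus": cpus, "mem": mem, "instances": instances}
    return Request(
        url=base_url.rstrip("/") + "/v2/apps",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Values taken from slide 44; the host name is a placeholder.
request = build_deploy_request(
    "http://marathon.example.internal:8080", "service-1", 0.1, 10.0, 1
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Keeping the payload this small is what lets a `make deploy-*` target stay identical across services: only the values in the deployment file change.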
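
Slide 40 shows a deployment file with `replicas`, `resources` (`cpu`, `memory`) and `healthChecks`, and slide 62 introduces `./mesos2k8s`. The deck does not show what that tool emits, so the following is only a guess at the kind of mapping involved: it turns the fields visible on slide 40 into a Kubernetes Deployment manifest. The function name `to_k8s_deployment`, the `image` argument and the label scheme are assumptions made for illustration. Health checks are left out because the slide elides them, and `apps/v1` is today's API group (a 2017 cluster would have used a beta group).

```python
import json


def to_k8s_deployment(name, image, deployment_file):
    """Map a slide-40-style deployment file onto a Deployment manifest (sketch)."""
    res = deployment_file["resources"]
    # Kubernetes expects resource quantities such as "2" or "200M".
    quantities = {"cpu": str(res["cpu"]), "memory": str(res["memory"])}
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "resources": {"requests": dict(quantities), "limits": dict(quantities)},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": deployment_file["replicas"],
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [container]},
            },
        },
    }


# Values from slide 40; the image reference is a placeholder.
slide_40 = {"replicas": 1, "resources": {"cpu": 2, "memory": "200M"}}
manifest = to_k8s_deployment("service-1", "registry.example/service-1:1.0", slide_40)
print(json.dumps(manifest, indent=2))
```

A pure function from one description to the other is easy to run in a pipeline for every service, which fits the talk's point that developers did not have to change their code.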
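
Slides 49 to 51 and 72 to 80 describe the routing decision made by the local NGINX: a request to `/service/<name>/...` goes to that service's upstream if it has servers, and once the service has left Mesos its upstream is empty and the request goes to the bridge instead. Rollback (slide 80) puts the server back in the upstream. The sketch below models only that decision. The function name `pick_upstream` and the dictionary layout are hypothetical, and taking the first server stands in for NGINX's own load balancing.

```python
import re

# /service/<name>/<anything>, the path shape used on slides 49 and 72.
SERVICE_PATH = re.compile(r"^/service/(?P<name>[^/]+)(?:/.*)?$")


def pick_upstream(path, local_upstreams, bridge_servers):
    """Return (server, path) for a request sent to the local router."""
    match = SERVICE_PATH.match(path)
    if match is None:
        raise ValueError("not a service path: %r" % path)
    servers = local_upstreams.get(match.group("name"), [])
    # An empty upstream means the service no longer runs on this platform,
    # so the request is handed to the bridge with its path unchanged.
    target = servers[0] if servers else bridge_servers[0]
    return target, path


bridge = ["10.0.20.1:5041"]                      # upstream bridge-1 on slide 72
migrated = {"service-1": []}                     # slide 72: upstream is empty
rolled_back = {"service-1": ["10.0.10.1:5041"]}  # slide 80: server restored

print(pick_upstream("/service/service-1/endpoint", migrated, bridge))
print(pick_upstream("/service/service-1/endpoint", rolled_back, bridge))
```

Because the caller always uses the same `localhost:5040/service/service-1/...` URL, moving the service or rolling it back is a routing-table change only.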
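
Slide 86 covers the opposite direction, from Kubernetes back to a service still on Mesos: the router matches the host name in one of three forms (`service.namespace.svc.cluster.local`, `service.namespace`, `service`) and rewrites `/(.*)` to `/service/$service/$1` before proxying to the bridge. The slide elides the actual `server_name` regex, so the pattern below is an assumption that merely accepts those three forms, and `rewrite_for_bridge` is a hypothetical name.

```python
import re

# Assumed pattern: accepts "foo", "foo.ns" and "foo.ns.svc.cluster.local".
HOST = re.compile(
    r"^(?P<service>[a-z0-9-]+)"
    r"(?:\.(?P<namespace>[a-z0-9-]+)(?:\.svc\.cluster\.local)?)?$"
)


def rewrite_for_bridge(host_header, path):
    """Mimic `rewrite ^/(.*)$ /service/$service/$1` for a given Host header."""
    hostname = host_header.split(":", 1)[0]  # drop an optional :port
    match = HOST.match(hostname)
    if match is None:
        raise ValueError("unrecognised host: %r" % host_header)
    if not path.startswith("/"):
        raise ValueError("path must start with '/': %r" % path)
    return "/service/%s/%s" % (match.group("service"), path[1:])


# The request from slide 83, as it would be forwarded on slide 87.
print(rewrite_for_bridge("foo:5040", "/endpoint"))
```

Deriving the service name from the host keeps callers inside the cluster on plain Kubernetes DNS names such as `http://foo:5040/endpoint`, while the bridge still receives the `/service/<name>/...` form used on the Mesos side.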