
Going planet-scale with Cloud Native Technologies


During the talk, I shared how GitLab's infrastructure has evolved and how we are going Cloud Native.

Abubakar Siddiq Ango

November 03, 2018


Transcript

  1. About Me
     - Based in Bauchi, Nigeria
     - Support Engineering at GitLab BV
     - Executive Director, Uplift Nigeria (uplift.ng)
     - Lead Organizer, GDG Bauchi & Google Cloud Developer Community Bauchi
     - CTO, GladePay.com
  2. In 2015
     - 20,000+ users
     - 100,000+ hosted repositories
     - 2 servers: 1 active, 1 backup
     - Server model: HP DL180 G6 (reconditioned; this model was introduced in 2009)
     - Processors: 2x X5690 (24 cores in total)
     - 32 GB RAM
     - 12x 2 TB HDDs (2 for the root volume in RAID 1, 10 for storage in RAID 10, ext4 filesystem)
  3. In 2016, on Azure... Running GitLab.com as an application:
     - 5 HAProxy load balancers handling GitLab.com HTTP, HTTPS, and SSH
     - 2 HAProxy load balancers handling "alternative SSH" (altssh.GitLab.com), redirecting port 443 to 22
     - 2 HAProxy load balancers handling https://pages.gitlab.io HTTP and HTTPS
     - 20 workers running the GitLab EE application stack (Nginx, Workhorse, Unicorn + Rails, Redis + Sidekiq)
     - 2 NFS servers for storage
     - 2 Redis servers
     - 2 PostgreSQL servers
     - 3 Elasticsearch servers
     - 6 of Azure's "Availability Sets": 3 for load balancers, 1 for Redis HA, 1 for PostgreSQL HA, and 1 for Elasticsearch HA
     - 3 servers for GitLab Runners in autoscale mode
     See: https://about.gitlab.com/2016/04/29/look-into-gitlab-infrastructure/
     With build hosts for Shared Runners, between 60 and 200 servers are running at a time.
  4. In 2016
     With over 2,000 new repos being created during peak hours, and CI runners requesting new builds 3,000,000 times per hour, we built a CephFS cluster to tackle both the capacity and performance issues of using NFS appliances.
     https://about.gitlab.com/2016/09/26/infrastructure-update/
  5. Late 2016: we almost went bare metal...
     With latency issues using CephFS, we considered going bare metal:
     https://about.gitlab.com/2016/11/10/why-choose-bare-metal/
     We came up with a server purchase proposal:
     https://about.gitlab.com/2016/12/11/proposed-server-purchase-for-gitlab-com/
     And shared it on Hacker News:
     https://news.ycombinator.com/item?id=13153031
     64 nodes with 1 TB of memory each (using 128 GB DIMMs) and 20 Gbps of bandwidth, allowing 1.4 PB of raw storage; at a replication factor of 3, this is 480 TB of usable storage using CephFS.
  6. ...then we took a step back after listening to the community:
     https://about.gitlab.com/2017/03/02/why-we-are-not-leaving-the-cloud/
     We decided to do 2 things:
     • We spread all our storage across multiple NFS shards and dropped CephFS from our stack.
     • We created Gitaly so that we could stop relying on NFS for horizontal scaling and speed up Git access through caching.
  7. “We want to scale intelligently and build great software; we don’t want to be an infrastructure company. We are embracing and are excited about solving the challenge of scaling GitLab.com on the cloud, because solving it for us also solves it for the largest enterprises in the world using GitLab on premise.”
  8. In 2018
     We decided to move from Azure to GCP, mainly to improve the performance and reliability of GitLab.com, and because we believe Kubernetes is the future.
     https://about.gitlab.com/2018/06/25/moving-to-gcp/
  9. Cloud Native
     “Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.”
  10. Why Cloud Native
     • Right-sized capacity
     • Speed
     • Reliability
     • Collaboration
     • Continuous delivery
     • Automatic scalability
     • Rapid recovery
  11. Key components of Cloud Native
     • Microservices
     • CI/CD toolset
     • Containers, e.g. Docker, containerd
     • Orchestrators, e.g. Kubernetes
     • Service meshes, e.g. Istio
     • And others
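     As an illustration of what an orchestrator does, here is a minimal sketch of a Kubernetes Deployment that asks the cluster to keep three replicas of a containerized service running (the names and image tag are hypothetical, not part of GitLab's actual setup):

     ```yaml
     # Minimal Kubernetes Deployment: the orchestrator keeps
     # three replicas of this container running and replaces
     # any that fail.
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: web            # hypothetical name
     spec:
       replicas: 3
       selector:
         matchLabels:
           app: web
       template:
         metadata:
           labels:
             app: web
         spec:
           containers:
             - name: web
               image: registry.example.com/web:1.0   # hypothetical image
               ports:
                 - containerPort: 8080
     ```

     Declaring the desired state ("three replicas") rather than scripting individual servers is what makes this approach declarative and self-healing.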
  12. Continuous Integration / Deployment
     Test, build, and deploy your microservices:
     • GitLab CI
     • CircleCI
     • Jenkins
     • TeamCity
     • And so on.
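     As a sketch of such a pipeline, here is a hypothetical `.gitlab-ci.yml` with three stages; the image names and deployment target are illustrative assumptions, while `$CI_REGISTRY_IMAGE` and `$CI_COMMIT_SHA` are variables GitLab CI provides:

     ```yaml
     # Hypothetical GitLab CI pipeline: test, build an image, deploy.
     stages:
       - test
       - build
       - deploy

     test:
       stage: test
       image: ruby:2.5                # assumed runtime for the app
       script:
         - bundle install
         - bundle exec rspec

     build:
       stage: build
       image: docker:stable
       services:
         - docker:dind
       script:
         - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
         - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

     deploy:
       stage: deploy
       script:
         # hypothetical deployment: roll the new image out to a cluster
         - kubectl set image deployment/web web=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
     ```

     Every push runs the same pipeline, so the path from commit to production is automated and repeatable.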
  13. Containers
     A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. Containers enable microservices.
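     A minimal, hypothetical Dockerfile shows the idea of packaging code with its dependencies (the base image and entry point are illustrative assumptions):

     ```dockerfile
     # Hypothetical Dockerfile: bundle an app and its dependencies
     # into one portable image.
     FROM python:3.6-slim                  # base image with the runtime
     WORKDIR /app
     COPY requirements.txt .
     RUN pip install -r requirements.txt   # bake dependencies into the image
     COPY . .
     EXPOSE 8080
     CMD ["python", "app.py"]              # hypothetical entry point
     ```

     Because everything the app needs ships inside the image, the same artifact runs identically on a laptop, a CI runner, or a production cluster.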
  14. Service Mesh
     A service mesh is a configurable infrastructure layer for a microservices application. It makes communication between service instances flexible, reliable, and fast. The mesh provides service discovery, load balancing, encryption, authentication and authorization, support for the circuit breaker pattern, and other capabilities.
     https://www.nginx.com/blog/what-is-a-service-mesh/
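     As one concrete example of mesh-level traffic control, here is a hypothetical Istio VirtualService that splits traffic between two versions of a service without touching application code (the service name and subsets are illustrative; the subsets would be defined in a companion DestinationRule):

     ```yaml
     # Hypothetical Istio VirtualService: send 90% of traffic to v1
     # of a service and 10% to v2, e.g. for a canary rollout.
     apiVersion: networking.istio.io/v1alpha3
     kind: VirtualService
     metadata:
       name: reviews          # hypothetical service name
     spec:
       hosts:
         - reviews
       http:
         - route:
             - destination:
                 host: reviews
                 subset: v1
               weight: 90
             - destination:
                 host: reviews
                 subset: v2
               weight: 10
     ```

     Shifting the weights gradually toward v2 is a common mesh-driven canary deployment pattern.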
  15. Thank you!
     Abubakar Siddiq Ango, GitLab
     @sarki247
     Lagos
     Slides at http://bit.ly/devfestlagos18-cloudnative