How GitLab scaled Git access with a Go service

In the early days of GitLab.com, we ran Rails worker processes, Sidekiq background processes, and Git storage on a single server: a typical monolith. As expected, it was easy to deploy and maintain, but really hard to scale. Having a single server meant we could only scale vertically, which became increasingly expensive and limiting.

At some point we had to run multiple servers, which meant Git repositories had to be available to all server nodes. For that, we first went for the "quick win", NFS. It rapidly showed its limitations: poor observability, a single point of failure, and high latency.

Among the alternatives we considered, we decided to slightly redesign the system and create a Go service called Gitaly, which now acts as a "git database" service for GitLab.com.

This talk goes through the history of how we gradually separated Git access from the monolith using feature flags, Protocol Buffers, gRPC and Go.

Oswaldo Ferreira

August 16, 2019

Transcript

  1. How GitLab scaled Git access with a Go service • Oswaldo Ferreira (@olsfer) • Backend Engineer at GitLab
  2. Hi • Backend Engineer at GitLab, Source Code Team • Merge Requests • Code Review • Web IDE • Enjoy hunting performance issues at GitLab.com • Moved to São Paulo about a year ago ✈
  3. Today we'll cover • Introduction: What GitLab is • How GitLab uses Git • How GitLab scaled Git storage and access • Limitations • Gitaly • gRPC, Protocol Buffers and Prometheus monitoring • How Gitaly fits into GitLab
  4. Introduction (GitLab) • Single application for the whole DevOps lifecycle • Git repository hosting • Code review • CI/CD • Monitoring • Security testing • Etc. • Open source projects written in Ruby, JavaScript and Go
  5. Introduction (GitLab) • Started as a self-hosted application: you host your own GitLab instance • GitLab.com (SaaS) now handles about 3k Git operations per second (with around 9 million projects) • GitLab.com runs on GitLab Enterprise Edition, whose stable packages are released (for self-hosted clients) on the 22nd of every month
  6. A bit of history • Early days of GitLab.com • Most of the application on a single server (Unicorn, Sidekiq, Git storage) • Easy to deploy and maintain • Only vertical scaling • Ran out of options for scaling GitLab.com vertically • Horizontal scaling had to be made possible
  7. How applications access Git • Libgit2 through Rugged • Libgit2 is a C implementation of Git core methods • Rugged is the Ruby binding for libgit2 (so everyone could contribute) • Directly through the Git command line • A consolidated internal interface (Ruby) for interacting with Git, both reading and writing, was built over time
  8. Ideas & Decisions • Concentrate all Git access logic within a single codebase (acting as a "git database") • gRPC and Protocol Buffers (cross-platform RPC and well-defined API) • Written in Go • Prometheus for monitoring
  9. [Diagram: unicorn-1.server calls git-1.server over gRPC] • Make Git work "locally" and return the results over the network • NFS servers become Gitaly servers • Gitaly has direct disk access (no more NFS latency)
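The idea on this slide (Git work happens next to the disk, and only the results travel over the network) can be sketched as follows. This is a minimal illustration, not Gitaly's code: it uses Go's standard `net/rpc` package over an in-memory connection as a stand-in for gRPC, and an in-memory map as a stand-in for repositories on local disk. `RefService`, `FindBranches` and the message types are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"sort"
)

// FindBranchesArgs and FindBranchesReply are hypothetical request and
// response messages, not Gitaly's real API.
type FindBranchesArgs struct{ Repo string }
type FindBranchesReply struct{ Names []string }

// RefService stands in for a Gitaly-like server. The map plays the role of
// repositories on the server's local disk (an assumption for this sketch).
type RefService struct {
	repos map[string][]string
}

// FindBranches does the "local" work on the server and returns only the
// result to the caller.
func (s *RefService) FindBranches(args *FindBranchesArgs, reply *FindBranchesReply) error {
	names, ok := s.repos[args.Repo]
	if !ok {
		return fmt.Errorf("repository %q not found", args.Repo)
	}
	reply.Names = append([]string(nil), names...)
	sort.Strings(reply.Names)
	return nil
}

// dial registers the service and connects a client to it over an in-memory
// pipe, standing in for the network between the Rails node and the Git node.
func dial(svc *RefService) (*rpc.Client, error) {
	server := rpc.NewServer()
	if err := server.Register(svc); err != nil {
		return nil, err
	}
	clientConn, serverConn := net.Pipe()
	go server.ServeConn(serverConn)
	return rpc.NewClient(clientConn), nil
}

func main() {
	svc := &RefService{repos: map[string][]string{
		"group/project.git": {"main", "feature"},
	}}
	client, err := dial(svc)
	if err != nil {
		panic(err)
	}
	defer client.Close()

	var reply FindBranchesReply
	args := &FindBranchesArgs{Repo: "group/project.git"}
	if err := client.Call("RefService.FindBranches", args, &reply); err != nil {
		panic(err)
	}
	fmt.Println(reply.Names) // prints [feature main]
}
```

The caller never touches the repository storage itself; it only sees the typed reply, which is what removes the need for a shared NFS mount.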
  10. Development • New team to develop Gitaly • Slowly roll it out and use it on both staging and production before Gitaly 1.0 (using feature flags)
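A gradual rollout behind a feature flag could look roughly like the sketch below. The slides do not describe GitLab's actual flag mechanism, so everything here is an assumption: `Flag`, `EnabledFor` and `FindCommitBackend` are hypothetical names, and the percentage-by-hash bucketing is just one common way to send a stable share of repositories through the new code path.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Flag is a hypothetical percentage-based feature flag.
type Flag struct {
	Name    string
	Percent uint32 // share of keys, 0 to 100, that get the new code path
}

// EnabledFor hashes the flag name and key into a bucket from 0 to 99, so the
// same repository always gets the same answer for a given percentage.
func (f Flag) EnabledFor(key string) bool {
	h := fnv.New32a()
	h.Write([]byte(f.Name + ":" + key))
	return h.Sum32()%100 < f.Percent
}

// FindCommitBackend reports which implementation would serve the request:
// the new service when the flag is on, the legacy in-process path otherwise.
func FindCommitBackend(f Flag, repo string) string {
	if f.EnabledFor(repo) {
		return "gitaly"
	}
	return "rugged"
}

func main() {
	flag := Flag{Name: "gitaly_find_commit", Percent: 25}
	for _, repo := range []string{"group/a", "group/b", "group/c"} {
		fmt.Println(repo, "->", FindCommitBackend(flag, repo))
	}
}
```

Raising `Percent` step by step moves more repositories onto the new path, and setting it back to 0 is an immediate rollback without a deploy.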
  11. gRPC • Open source message exchange framework • Used by Slack and other beauties • Low latency, highly scalable • HTTP/2 • Protocol Buffers as a descriptive language for interfaces
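  12. pBuffers ✨ gRPC code • [Diagram: a .proto file goes through the protobuf tool, which generates the Ruby bindings (ref_pb.rb, shipped in the Gitaly gem) and the Go bindings (ref.pb.go)] • Supported languages: C++, Java, Python, Go, Ruby, C#, Node.js, Objective-C, etc.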
  13. Advantages for our scenario • Mature interface for interacting with Git • Server-to-server binary message streaming • Numbered fields are powerful for versioning (backward compatible by default) • Language interoperability: Ruby client, Go server
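To see why numbered fields help with versioning, here is a hand-rolled sketch of the Protocol Buffers wire format restricted to varint fields. Each field is written as a key (field number shifted left by 3, with the wire type in the low 3 bits) followed by its value, so a reader can skip field numbers it does not know. Real code would use the generated bindings; `appendVarintField` and `decodeKnown` are hypothetical helpers written only for this illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// appendVarint appends v to buf using base-128 varint encoding.
func appendVarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// appendVarintField appends one varint field: the key, then the value.
func appendVarintField(buf []byte, fieldNumber, value uint64) []byte {
	buf = appendVarint(buf, fieldNumber<<3) // low 3 bits are wire type 0 (varint)
	return appendVarint(buf, value)
}

// decodeKnown reads every varint field in buf but keeps only the field
// numbers listed in known, silently skipping the rest.
func decodeKnown(buf []byte, known map[uint64]bool) (map[uint64]uint64, error) {
	out := make(map[uint64]uint64)
	for len(buf) > 0 {
		key, n := binary.Uvarint(buf)
		if n <= 0 {
			return nil, errors.New("malformed field key")
		}
		buf = buf[n:]
		if key&7 != 0 {
			return nil, fmt.Errorf("unsupported wire type %d", key&7)
		}
		value, m := binary.Uvarint(buf)
		if m <= 0 {
			return nil, errors.New("malformed field value")
		}
		buf = buf[m:]
		if field := key >> 3; known[field] {
			out[field] = value
		}
	}
	return out, nil
}

func main() {
	// A newer writer sends field 1 plus a field 2 that older readers predate.
	msg := appendVarintField(nil, 1, 20)
	msg = appendVarintField(msg, 2, 1)

	oldView, err := decodeKnown(msg, map[uint64]bool{1: true})
	if err != nil {
		panic(err)
	}
	newView, err := decodeKnown(msg, map[uint64]bool{1: true, 2: true})
	if err != nil {
		panic(err)
	}
	fmt.Println(oldView) // prints map[1:20]
	fmt.Println(newView) // prints map[1:20 2:1]
}
```

The old reader keeps working against the newer message because it identifies fields by number rather than by position or name. That only holds as long as existing field numbers are never reused for a different meaning.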
  14. Prometheus • Open source, written in Go • Time-series-based monitoring • Own query language (PromQL) • Handles alerting • Great for monitoring (not great for general logging) • Error ratio, request ratio, cache hit/miss
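As a rough sketch of the kind of metrics this slide lists, the snippet below keeps a few counters, derives an error ratio and a cache hit ratio from them, and renders the counters in the Prometheus text exposition format. It uses only the standard library rather than the Prometheus client library, and the `Counters` type and `example_*` metric names are hypothetical. In a real setup the service would only expose the raw counters, and ratios would typically be derived in PromQL from their rates.

```go
package main

import (
	"fmt"
	"strings"
)

// Counters holds hypothetical, monotonically increasing totals for a service.
type Counters struct {
	Requests, Errors, CacheHits, CacheMisses uint64
}

// ratio returns part/total, or 0 when total is zero.
func ratio(part, total uint64) float64 {
	if total == 0 {
		return 0
	}
	return float64(part) / float64(total)
}

// ErrorRatio is the share of requests that failed.
func (c Counters) ErrorRatio() float64 { return ratio(c.Errors, c.Requests) }

// CacheHitRatio is the share of cache lookups that were hits.
func (c Counters) CacheHitRatio() float64 {
	return ratio(c.CacheHits, c.CacheHits+c.CacheMisses)
}

// Exposition renders the counters as Prometheus text exposition lines.
func (c Counters) Exposition() string {
	var b strings.Builder
	write := func(name string, value uint64) {
		fmt.Fprintf(&b, "# TYPE %s counter\n%s %d\n", name, name, value)
	}
	write("example_requests_total", c.Requests)
	write("example_errors_total", c.Errors)
	write("example_cache_hits_total", c.CacheHits)
	write("example_cache_misses_total", c.CacheMisses)
	return b.String()
}

func main() {
	c := Counters{Requests: 200, Errors: 5, CacheHits: 90, CacheMisses: 10}
	fmt.Printf("error ratio: %.3f\n", c.ErrorRatio())        // prints error ratio: 0.025
	fmt.Printf("cache hit ratio: %.2f\n", c.CacheHitRatio()) // prints cache hit ratio: 0.90
	fmt.Print(c.Exposition())
}
```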
  15. sum() • Faster message exchange • Reliable typed interfaces between client and server • More visibility • Self-contained logic • Higher entry barrier for contributions • Additional maintenance complexity • Funny enough: higher visibility made us slower at first