Google's Production Environment

Google datacenters differ substantially from conventional datacenters and small-scale server farms. These differences bring both additional challenges and opportunities, which this talk discusses. It is based on the Google SRE book: https://landing.google.com/sre/sre-book/chapters/production-environment/

Florian Rathgeber

February 08, 2020

Transcript

  1. #GoogleSandbox
    Google's Production Environment
    Florian Rathgeber, Site Reliability Engineer, Google Cloud

  2. Florian
    Site Reliability Engineer
    Google Cloud
    SRE for 2+ years
    ● On the Cloud Console SRE team
    ● Spend most of my time on SLOs
    Previous life
    ● Computational Scientist @ Imperial College
    ● Data Engineer @ ECMWF
    Co-founded PyData London

  3. Google's Globe-Spanning Network
    Submarine cable investments
    Current fiber network
    https://cloud.google.com/about/locations/

  4. Google Data Centers
    ● B4
    ● Edge Network
    ● GSLB
    ● Jupiter
    ● GFE
    Current Regions & Number of Zones
    Future Regions & Number of Zones
    https://cloud.google.com/about/locations/

  5. Data Center Setup
    ● Campus
    ● Data center
    ● Cluster
    ● Row
    ● Rack
    ● Machine

  6. Cluster Management
    Diagram labels: Cluster, BorgMaster, Scheduler, Persistent store, Config Files, Tools, Borglet (×3)
    BNS addresses: /bns////

  7. Lock Service
    Chubby
    Consistent data, e.g.
    - BNS paths -> IP addresses
    - master election
    Diagram labels (repeated): Chubby, Paxos, Cluster
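
The Lock Service slide names master election as one use of Chubby. The idea can be sketched with a lock that only one candidate can hold at a time. `ToyLockService`, `elect`, the lock path, and the replica names below are all hypothetical and in-memory; this is not Chubby's API and leaves out Paxos replication, sessions, and lease expiry.

```python
import threading


class ToyLockService:
    """Hypothetical in-memory stand-in for a lock service; not Chubby's API."""

    def __init__(self):
        self._guard = threading.Lock()
        self._holders = {}  # lock path -> name of the current holder

    def try_acquire(self, path, candidate):
        """Grant the lock at `path` if it is free; report whether `candidate` holds it."""
        with self._guard:
            if path not in self._holders:
                self._holders[path] = candidate
            return self._holders[path] == candidate

    def release(self, path, candidate):
        """Release the lock only if `candidate` is the current holder."""
        with self._guard:
            if self._holders.get(path) == candidate:
                del self._holders[path]

    def holder(self, path):
        """Return the current holder of `path`, or None if the lock is free."""
        with self._guard:
            return self._holders.get(path)


def elect(service, path, candidates):
    """Every candidate tries to take the same lock; the holder is the master."""
    return [c for c in candidates if service.try_acquire(path, c)]


service = ToyLockService()
winners = elect(service, "/election/master", ["replica-0", "replica-1", "replica-2"])
```

Because the lock is granted atomically, `winners` contains exactly one name; the other replicas can look up the current master with `holder` and retry once the lock is released.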

  8. Storage
    Diagram labels: D, HDD, SSD, Colossus, Bigtable, Spanner, Cluster, ...

  9. Monitoring
    Diagram labels: Server, Prober, Scraping Borgmon, Cluster Borgmon, Global Borgmon, Cluster, Time Series Database, Alert Manager
    Flows: Data, Alerts
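
The Monitoring slide shows metrics being scraped from servers into a time series database, with alerts derived from that data. A minimal sketch of that flow follows; `ToyTimeSeriesDB`, `scrape`, `error_ratio_alerts`, the metric names, and the 5% threshold are assumptions for illustration and do not reflect Borgmon's rule language or interfaces.

```python
from collections import defaultdict


class ToyTimeSeriesDB:
    """Hypothetical in-memory store of (timestamp, value) samples per series."""

    def __init__(self):
        self._series = defaultdict(list)

    def append(self, name, timestamp, value):
        self._series[name].append((timestamp, value))

    def latest(self, name):
        """Return the most recent value recorded for the series `name`."""
        return self._series[name][-1][1]


def scrape(db, targets, timestamp):
    """Pull each target's current metrics into the database."""
    for target_name, read_metrics in targets.items():
        for metric, value in read_metrics().items():
            db.append(f"{target_name}/{metric}", timestamp, value)


def error_ratio_alerts(db, target_names, threshold):
    """Return the targets whose latest errors/requests ratio exceeds `threshold`."""
    firing = []
    for name in target_names:
        requests = db.latest(f"{name}/requests")
        errors = db.latest(f"{name}/errors")
        if requests > 0 and errors / requests > threshold:
            firing.append(name)
    return firing


# Hypothetical targets: each exposes a callable returning its current counters.
targets = {
    "server-a": lambda: {"requests": 1000, "errors": 5},
    "server-b": lambda: {"requests": 1000, "errors": 80},
}
db = ToyTimeSeriesDB()
scrape(db, targets, timestamp=0)
firing = error_ratio_alerts(db, sorted(targets), threshold=0.05)
```

Here only `server-b` exceeds the threshold (80 errors in 1000 requests), so it is the only entry in `firing`.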

  10. Server Communication
    Diagram labels: Service, Client (×2), Stubby Server, Stubby Stub (×2), C++, Java, Ruby
    Flows: protobuf request, protobuf response
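
The Server Communication slide shows clients calling a service through stubs that exchange serialized request and response messages. The pattern can be sketched as below. JSON stands in for protobuf encoding, the call is made in-process rather than over a network, and `EchoRequest`, `EchoResponse`, `EchoServer`, and `EchoStub` are hypothetical names, not Stubby's or gRPC's API.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EchoRequest:
    text: str


@dataclass
class EchoResponse:
    text: str
    length: int


class EchoServer:
    """Hypothetical server: decodes a request, runs the handler, encodes the response."""

    def handle(self, payload: bytes) -> bytes:
        request = EchoRequest(**json.loads(payload))
        response = EchoResponse(text=request.text.upper(), length=len(request.text))
        return json.dumps(asdict(response)).encode("utf-8")


class EchoStub:
    """Hypothetical client stub: hides serialization and transport from the caller."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking bytes and returning bytes

    def echo(self, request: EchoRequest) -> EchoResponse:
        payload = json.dumps(asdict(request)).encode("utf-8")
        return EchoResponse(**json.loads(self._transport(payload)))


server = EchoServer()
stub = EchoStub(server.handle)  # in-process call instead of a network hop
reply = stub.echo(EchoRequest(text="ping"))
```

The caller only sees typed request and response objects; swapping the transport callable for a real network client would not change the calling code.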

  11. Code Repository
    Diagram labels: Piper (Code Repository), Author, Changelist, Reviewer ("Looks Good To Me"), Owner (Approval), Presubmit Checks (OK!), submit...done!, change
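
The Code Repository slide shows three gates between a changelist and submission: a reviewer's "Looks Good To Me", an owner's approval, and presubmit checks. A minimal sketch of such a gate follows. The `Changelist` fields and `can_submit` are hypothetical, and the rules encoded here (all three gates required, the author's own LGTM not counted) are assumptions for illustration rather than a description of Piper.

```python
from dataclasses import dataclass, field


@dataclass
class Changelist:
    """Hypothetical record of a change awaiting submission."""
    author: str
    lgtm_by: set = field(default_factory=set)      # reviewers who said LGTM
    approved_by: set = field(default_factory=set)  # people who gave approval
    presubmit_passed: bool = False


def can_submit(change, owners):
    """Allow submission only when review, owner approval, and presubmit all pass."""
    reviewed = bool(change.lgtm_by - {change.author})  # assumption: self-LGTM ignored
    approved = bool(change.approved_by & owners)       # at least one owner approved
    return reviewed and approved and change.presubmit_passed


owners = {"olivia"}
change = Changelist(author="alice")
blocked_initially = can_submit(change, owners)

change.lgtm_by.add("rob")
change.approved_by.add("olivia")
change.presubmit_passed = True
allowed_after_review = can_submit(change, owners)
```

With no review the change is blocked; once a reviewer, an owner, and the presubmit checks have all signed off, `can_submit` returns `True`.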

  12. Continuous Build and Deployment
    Diagram labels: Piper Code Repository, Blaze, Continuous Testing Framework, Binaries, Tests (PASS, FAIL, PASS, ...), MPM, Rapid, Sisyphus, Production

  13. Tying it all together...
    ● Develop the software: Piper, Blaze
    ● Build the MPMs: Rapid
    ● Run it in a cluster: Borg, which uses Chubby
    ● Route requests/responses: GFE, GSLB, ProtoBuf, Stubby
    ● Store and read messages: Colossus, Bigtable, Spanner
    ● Monitor and fire alerts: Borgmon
    ● Roll out new versions: Sisyphus

  14. List of related open-source projects
    ● Cluster management: Kubernetes kubernetes.io
    ● Lock service: ZooKeeper zookeeper.apache.org, etcd coreos.com/etcd
    ● Storage: HDFS hadoop.apache.org, Cassandra cassandra.apache.org
    ● Monitoring: Prometheus prometheus.io
    ● RPC: gRPC grpc.io
    ● Data serialization: Protocol Buffers developers.google.com/protocol-buffers
    ● Google style guides github.com/google/styleguide
    ● The Go programming language golang.org
    ● Code repository: Git git-scm.com
    ● Code review: Rietveld github.com/rietveld-codereview/rietveld
    ● Building: Bazel bazel.io

  15. Cover images used with permission. These books can be found on shop.oreilly.com
    The full text of the Google SRE books is available at www.google.com/sre