Building Software at Google Scale

Lee Boonstra

August 14, 2017

Transcript

  1. Think 10x! - Dare to be audacious!

    FROM IDEA TO PRODUCT > WRITING CODE > TESTING/BUILDING CODE > DEPLOYING CODE > SRE
  2. 1+ Billion Users

    • 2 trillion Google searches annually
    • 65 billion downloads of apps from the Google Play store
    • More than 1 billion people use the Chrome browser on mobile devices every month
    • 200 million people per month use the online photo service, Google Photos
  3. Agile Development teams

    FROM IDEA TO PRODUCT
  4. Agile/Scrum process at Google

    Diagram labels: Release Backlog, Release Planning, Scrum Sprint, Task List, Product Backlog, Sprint Planning, User Stories, Working Software, Sprint Review, Sprint Retrospective, Product Vision
  5. What it takes to be a Google engineer

    Working on problems with SPEED AND SCALE is a challenge. Engineers keep raising the bar on the tools and infrastructure.

    Google Culture:
    • Collaboration and co-development
    • Sharing between products and teams (tools, libraries, services)
    • Engineers have autonomy
    • Agile/Scrum, daily stand-up meetings
  6. The Google Codebase

    FROM IDEA TO PRODUCT > DEVELOPING
  7. Google Repository statistics

    As of Jan 2015:
    • Total number of files: 1+ billion
    • Number of source files: 9 million
    • Lines of code: 2+ billion
    • Depth of history: 35 million commits
    • Size of content: 86 terabytes
  8. Google builds

    • Commits per workday: 45 thousand
    • Number of file read requests: Billions *
    • Number of builds: 800k
    • Build output: 2+ petabytes
    • Test cases run: 75+ million

    * (800K QPS daily peak)
    Source: Google Internal Data
  9. Advantages of monolithic repo

    • Unified versioning - One source of truth
    • Extensive code sharing and reuse
    • Collaboration across teams
    • Simplified dependency management
    • Large scale refactoring
    • Flexible team boundaries & code ownership
    • Code visibility
  10. Google uses its own version control system called: Piper

    Workflow: Sync workspace > Write code > Code Review > Commit

    Diagram annotations: Automated Test / Analysis, Read/Write Access per folder, Code Quality & Syntax Check (by humans and by tooling), Create personal copy, Auto Rollback if needed, MANDATORY

    A single code tree, with fast access to the code through tooling. All kinds of programming languages. Everyone works in trunk; branches are for releases.
  11. Tooling @ Google

    FROM IDEA TO PRODUCT > DEVELOPING > TESTING
  12. Tooling @ Google

    FROM IDEA TO PRODUCT > DEVELOPING > TESTING > BUILDING
  13. Build systems

    Why do we need build systems? Code has a lot of dependencies, and you don’t want to compile and link them all manually. The steps of a general build system:
    1. Loading
    2. Analysis
    3. Execution (by the build system)
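
    The three steps can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google’s build system is implemented: the target names, the BUILD_GRAPH dictionary and the load/analyze/execute functions are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical targets: each one maps to the targets it depends on.
BUILD_GRAPH = {
    "app": ["lib_net", "lib_util"],
    "lib_net": ["lib_util"],
    "lib_util": [],
}

def load(graph):
    """Loading: read the target declarations (here, just copy the dict)."""
    return {target: list(deps) for target, deps in graph.items()}

def analyze(targets):
    """Analysis: order targets so each one comes after its dependencies."""
    return list(TopologicalSorter(targets).static_order())

def execute(order):
    """Execution: produce one action per target, in dependency order."""
    return [f"compile {target}" for target in order]

plan = execute(analyze(load(BUILD_GRAPH)))
print(plan)
```

    A real build system does far more in each phase (parsing build files, resolving toolchains, running actions in parallel), but the ordering constraint is the same: nothing is compiled before the things it depends on.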
  14. Google’s continuous build and test system

    Google has its own continuous build & test system. Remember, at Google we develop everything at HEAD in the repo. Cloud computing makes endless CPU and cross-user caching possible.
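
    Cross-user caching can be pictured as a shared cache keyed by a digest of a build action’s command and inputs. The sketch below is an assumption about how such a cache could work, not a description of Google’s system; ActionCache and fake_compile are hypothetical names.

```python
import hashlib

class ActionCache:
    """Shared cache keyed by a digest of the command and its input contents."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # counts real (non-cached) runs

    @staticmethod
    def key(command, inputs):
        # Same command + same file contents => same key, whoever asks.
        digest = hashlib.sha256(command.encode())
        for name in sorted(inputs):
            digest.update(name.encode())
            digest.update(b"\0")
            digest.update(inputs[name])
            digest.update(b"\0")
        return digest.hexdigest()

    def run(self, command, inputs, action):
        k = self.key(command, inputs)
        if k not in self._results:
            self.executions += 1
            self._results[k] = action(inputs)
        return self._results[k]

def fake_compile(inputs):
    # Stand-in for a real compiler: the "output" is the total input size.
    return sum(len(data) for data in inputs.values())

cache = ActionCache()
sources = {"main.c": b"int main() { return 0; }"}
alice = cache.run("cc main.c", sources, fake_compile)  # executes the action
bob = cache.run("cc main.c", sources, fake_compile)    # served from the cache
```

    Because everyone builds at HEAD from the same tree, many engineers request identical actions, which is what makes a shared cache pay off.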
  15. Bazel

    You will need a build system if you work in teams. Google’s build system is now available as open source. Google has been working on it for more than 10 years, and now you can benefit from it. https://bazel.build/

    • Scalable: Bazel helps you scale your organization, codebase and Continuous Integration system. It handles codebases of any size, in multiple repositories or a huge monorepo.
    • Platform independent: Works on Cloud or On Premise.
    • Any language: Build and test Java, C++, Android, iOS, Go and a wide variety of other language platforms (via extensions).
  16. Bazel

    1. You will need to write a BUILD file. The rule (in this case a Java rule, java_binary) tells Bazel to build a .jar file and a wrapper shell script (both named after the target).
    2. To build, run bazel build or bazel test from the command line.
    3. It will build all source files.

    See: https://github.com/bazelbuild/examples/

    1. Write BUILD file

        java_binary(
            name = "ProjectRunner",
            srcs = glob(["src/main/java/com/example/*.java"]),
        )

    2. Execute on CLI:

        $ bazel build :ProjectRunner
        INFO: Found 1 target...
        Target //:ProjectRunner up-to-date:
          bazel-bin/ProjectRunner.jar
          bazel-bin/ProjectRunner
        INFO: Elapsed time: 1.021s, Critical Path: 0.83s
  17. Devops @ Google

    FROM IDEA TO PRODUCT > DEVELOPING > TESTING > BUILDING > DEPLOYING
  18. Each week Google launches over 2 billion containers.

    Google has been using container technology for more than 10 years.
  19. So, you mean Docker?

    • Docker is a popular software container platform.
    • Containers are a way to package software in a format that can run isolated on a shared operating system.
  20. Containers at Google

    Chart (2004 to 2016): Core Ops Team vs. Number of running jobs

    • The key to efficiently managing systems
    • Enabled Google to grow our fleet of applications 10x faster than we grow our Ops team
  21. Large-scale cluster management at Google with Borg

    • It’s software that manages all production machines at Google and runs the jobs (binaries) that engineers give it on them.
    • Borg ran pretty much everything inside the company, including Google Search, Gmail, Google Maps, Google Docs...
    • These binaries are run in a container environment.
    • When tasks die, they are automatically started up again, and they may run on a different machine.
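
    The restart behaviour in the last bullet can be sketched as a small reconciliation function. This is a toy model under stated assumptions, not Borg’s actual scheduler: the reconcile function, the task and machine names, and the least-loaded placement rule are all hypothetical.

```python
from collections import Counter

def reconcile(desired_tasks, placements, healthy_machines):
    """Return placements in which every desired task runs on a healthy machine."""
    healthy = sorted(healthy_machines)
    if not healthy:
        raise RuntimeError("no healthy machines available")
    # Keep tasks that are still running on a healthy machine.
    result = {task: machine for task, machine in placements.items()
              if task in desired_tasks and machine in healthy_machines}
    load = Counter(result.values())
    for task in sorted(desired_tasks):
        if task not in result:
            # Restart on the least-loaded healthy machine (ties broken by name).
            target = min(healthy, key=lambda m: (load[m], m))
            result[task] = target
            load[target] += 1
    return result

# Machine m2 has failed; m3 has been added to the pool.
placements = {"web-0": "m1", "web-1": "m2", "web-2": "m2"}
after = reconcile({"web-0", "web-1", "web-2"}, placements, {"m1", "m3"})
print(after)
```

    Here web-0 stays where it was, while the two tasks that were on the failed machine are started again elsewhere.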
  22. History of Containers

    Timeline (2004 to 2016), marked years: 2004, 2006, 2013, 2014
    Timeline labels: Limited isolation, Released CGroups, Docker, Kubernetes, Borg, Apache Mesos
  23. Borg and Kubernetes architecture (diagram)

    Borg: borgcfg and web browsers talk to a Cell containing the replicated BorgMaster (link shard, read/UI shard), the scheduler, a persistent store (Paxos) and a Borglet per machine. Inputs: config file, binary.

    Kubernetes: kubectl and web browsers talk to a Cluster containing the k8s Master (API server, controller manager), the scheduler, a persistent store (etcd) and a kubelet per machine. Inputs: config file, binary.
  24. Kubernetes (Container Orchestration)

    Kubernetes abstracts away the hardware infrastructure and exposes your whole data center as a single enormous computing resource.

    • Multiple container engines (Docker, rkt, Windows)
    • Cloud and bare-metal environments
    • Container Engine = Managed Kubernetes in Google Cloud

    https://kubernetes.io
  25. Kubernetes Open Source Community

    • 50k+ commits in Kubernetes
    • 1,000+ unique contributors
    • Top 0.001% of all GitHub Projects
    • 4000+ External Projects Based on Kubernetes

    Companies Contributing: supported by a broad ecosystem of partners, offering you cloud provider flexibility.
  26. Learn Kubernetes

    Learn how to containerise workloads, deploy them to Google Container Engine clusters, scale them to handle increased traffic, and continuously deploy to provide application updates.

    Kubernetes Kickstart Training
    20 September, Eindhoven, 9:00 - 17:00
    https://cloudplatformonline.com/EMEA-Kubernetes-Bootcamp-Eindhoven.html
  27. Istio (A Service Mesh)

    • A complete framework for connecting, securing, managing and monitoring services
    • Secure and monitor traffic for microservices and legacy services without requiring any changes to application code
    • An open platform with key contributions from Google, IBM, Lyft and others
    • Allows developers to authenticate and secure the communications between different applications using a TLS connection
    • Multi-environment and multi-platform, but Kubernetes first

    https://istio.io
  28. Spinnaker (Continuous Delivery)

    Open source, open cloud, multi-cloud delivery tool for deploying container images, artifacts and DEB packages into production cloud environments.

    • Project Asgard, started inside Netflix; Google and others joined in
    • Deploy applications quickly, reliably and safely
    • Install on-prem, on a VM or on Kubernetes
    • Includes a rich UI dashboard
    • Integrates seamlessly with your existing continuous integration (CI) workflows. Trigger pipelines from git, Jenkins, Travis CI, Docker registries, on a cron-like schedule, or even other pipelines.

    https://www.spinnaker.io
  29. SRE @ Google

    FROM IDEA TO PRODUCT > DEVELOPING > TESTING > BUILDING > DEPLOYING > SITE RELIABILITY
  30. “Hope is not a strategy. Engineering solutions to design, build,

    and run large-scale systems scalably, reliably and efficiently is a strategy, and a good one.”
  31. Site Reliability Engineering

    • Site Reliability Engineering is a specialized job function that focuses on the reliability and maintainability of large systems.
    • SRE is also a mindset, and a set of engineering approaches to running better production systems.
    • Google has teams of site reliability engineers, each responsible for keeping a service globally available.
  32. Site Reliability Engineering

    Members of the SRE team explain how their engagement with the entire software lifecycle has enabled Google to build, deploy, monitor, and maintain some of the largest software systems in the world.

    Read the book online for free: https://landing.google.com/sre/book.html
  33. Google is a leader in Open Source

    • 287,024 commits by Googlers to open source projects on GitHub in 2016
    • 15,000+ projects contributed to in 2016
  34. In March this year, Google launched a site that brings all of the company’s open source projects under a single umbrella.

    The code of these projects will live on GitHub and Google’s self-hosted Git service, but this site functions as a central directory. It also provides a “look under the hood” of how Google does open source. https://opensource.google.com
  35. Google has written many whitepapers that have inspired the big data community.

    • Bigtable
    • GFS
    • MapReduce
    • Chubby
    • Sawzall
    • Dapper
    • Dremel

    https://research.google.com/
  36. Conclusion

    Building software at Google scale requires a change in the way of working. Think Big! Agile. Prototyping. But also Tooling.

    Building software at Google scale, use open source:
    • Open Source @ Google - https://opensource.google.com
    • Google Whitepapers - https://research.google.com
    • Bazel - https://bazel.build - Google’s Build & Test System
    • Bazel - Code Example: https://github.com/bazelbuild/examples
    • Kubernetes - https://kubernetes.io
    • Istio - https://istio.io
    • Spinnaker - https://www.spinnaker.io
    • SRE free O’Reilly ebook - https://landing.google.com/sre/book.html

    Building software at Google scale, contribute to open source!