Confidential & Proprietary Google Cloud Platform
Building Software at Google Scale
How does Google build software, and how can you benefit from it?
By Lee Boonstra, Customer Engineer, Google Cloud
Engineering at Google
1. How the engineering processes at Google work
2. Our learnings, and how we contribute back to open source
3. From open source to Google Cloud for enterprises
What it takes to be a Google engineer
Working on problems with speed and scale is a challenge. Engineers keep raising the bar on the tools and infrastructure.
Google culture:
• Collaboration and co-development
• Sharing between products and teams (tools, libraries, services)
• Engineers have autonomy
• Agile/Scrum, daily stand-up meetings
Google repository statistics (as of January 2015)
Total number of files: 1+ billion
Number of source files: 9 million
Lines of code: 2+ billion
Depth of history: 35 million commits
Size of content: 86 terabytes
Advantages of a monolithic repo
● Unified versioning: one source of truth
● Extensive code sharing and reuse
● Collaboration across teams
● Simplified dependency management
● Large-scale refactoring
● Flexible team boundaries and code ownership
● Code visibility
Google uses its own version control system, called Piper.
Workflow: Sync workspace, Write code, Code review, Commit
[Diagram labels: Create personal copy; Automated test / analysis; Code quality & syntax check (by humans and by tooling); Mandatory; Read/write access per folder; Auto rollback if needed]
A single code tree, with fast access to the code through tooling. All programming languages. Everyone works in trunk; branches are for releases.
Build systems
Why do we need build systems? Code has many dependencies, and you don't want to compile and link them all manually.
The steps of a general build system:
1. Loading
2. Analysis
3. Execution
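The three steps above can be pictured with a toy example. This is a minimal sketch and not how Google's build system or Bazel is implemented: the target names, the `BUILD_FILES` dictionary, and the `load`, `analyze`, `execute`, and `build` helpers are all hypothetical, and the "build action" is only a string.

```python
from graphlib import TopologicalSorter

# Hypothetical target declarations: each target lists the targets it depends on.
BUILD_FILES = {
    "app": ["lib_net", "lib_util"],
    "lib_net": ["lib_util"],
    "lib_util": [],
}


def load(build_files):
    """Loading: read the declarations and check that every dependency exists."""
    for target, deps in build_files.items():
        for dep in deps:
            if dep not in build_files:
                raise KeyError(f"{target} depends on unknown target {dep}")
    return {target: set(deps) for target, deps in build_files.items()}


def analyze(graph):
    """Analysis: order targets so each dependency comes before its dependents."""
    return list(TopologicalSorter(graph).static_order())


def execute(order, action):
    """Execution: run the build action for each target in the computed order."""
    return [action(target) for target in order]


def build(build_files, action=lambda target: f"built {target}"):
    """Run the three phases in sequence."""
    return execute(analyze(load(build_files)), action)


if __name__ == "__main__":
    for line in build(BUILD_FILES):
        print(line)
```

Separating analysis from execution means the full dependency graph is known before any work starts, which is what lets a real build system skip or parallelize steps.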
Google's continuous build and test system
Google has its own continuous build and test system. Remember, at Google we develop everything at HEAD in the repo. Thanks to cloud computing: endless CPU and cross-user caching.
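The slide does not say how cross-user caching works. One common way to get it, shown here purely as an assumption, is to key each build result on a hash of the command and its input contents, so that two users building the same inputs share one result. `SharedActionCache` and `compile_step` are hypothetical names for illustration.

```python
import hashlib


class SharedActionCache:
    """Hypothetical cache shared by all users, keyed by what an action consumes."""

    def __init__(self):
        self._results = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(command, inputs):
        """Hash the command plus every input file name and its content."""
        digest = hashlib.sha256()
        digest.update(command.encode() + b"\0")
        for name in sorted(inputs):
            digest.update(name.encode() + b"\0")
            digest.update(hashlib.sha256(inputs[name]).digest())
        return digest.hexdigest()

    def run(self, command, inputs, action):
        """Return the cached result, or run the action once and store it."""
        cache_key = self.key(command, inputs)
        if cache_key in self._results:
            self.hits += 1
            return self._results[cache_key]
        self.misses += 1
        result = action(inputs)
        self._results[cache_key] = result
        return result


def compile_step(inputs):
    # Stand-in for real work: report the total size of the inputs in bytes.
    return sum(len(content) for content in inputs.values())


if __name__ == "__main__":
    cache = SharedActionCache()
    source = {"main.cc": b"int main() {}"}
    cache.run("compile", source, compile_step)        # first user: does the work
    cache.run("compile", dict(source), compile_step)  # second user: cache hit
    print(cache.hits, cache.misses)
```

Because the key depends only on content, it does not matter which user or machine asks; any change to an input produces a new key and forces a rebuild.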
Enter the container
[Figure: three stacks side by side (Virtual machine, Bare-metal server, Container), each drawn with the layers OS, Dependencies, Application Code, and Hardware.]
So, you mean Docker?
[Timeline: 2004, 2016]
● Docker is a popular software container platform.
● Containers are a way to package software in a format that can run isolated on a shared operating system.
Enter the container… and new challenges
● Scheduling and scaling across clusters of servers
● Networking and connectivity
● Security and access control
● Logging, monitoring, and debugging
● Health checks and uptime preservation
● ...
Large-scale cluster management at Google with Borg
[Timeline: 2004, 2016]
● Borg is the software that manages all production machines at Google and runs the jobs (binaries) that engineers submit to it on those machines.
● Borg ran pretty much everything inside the company, including Google Search, Gmail, Google Maps, Google Docs...
● These binaries run in a container environment.
● When tasks die, they are automatically restarted, possibly on a different machine.
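The last point, restarting dead tasks on another machine, can be illustrated with a toy scheduler. This is a minimal sketch under stated assumptions, not Borg's actual algorithm: the `Cluster` class, the machine names, and the "fewest tasks wins" placement rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical cluster: tracks machine health and where each task runs."""

    machines: dict  # machine name -> True if healthy
    placements: dict = field(default_factory=dict)  # task name -> machine name

    def schedule(self, task):
        """Place the task on the healthy machine that runs the fewest tasks."""
        healthy = [name for name, ok in self.machines.items() if ok]
        if not healthy:
            raise RuntimeError("no healthy machine available")
        load = {name: 0 for name in healthy}
        for machine in self.placements.values():
            if machine in load:
                load[machine] += 1
        # Ties are broken by machine name so the result is deterministic.
        chosen = min(healthy, key=lambda name: (load[name], name))
        self.placements[task] = chosen
        return chosen

    def fail_machine(self, machine):
        """Mark a machine as dead and restart its tasks on other machines."""
        self.machines[machine] = False
        dead_tasks = [t for t, m in self.placements.items() if m == machine]
        return {task: self.schedule(task) for task in dead_tasks}


if __name__ == "__main__":
    cluster = Cluster(machines={"m1": True, "m2": True, "m3": True})
    cluster.schedule("search")
    cluster.schedule("gmail")
    print(cluster.fail_machine("m1"))
```

The engineer who submitted the job never names a machine; the cluster manager owns placement, which is what makes automatic restarts possible.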
“Hope is not a strategy. Engineering solutions to design, build, and run large-scale systems scalably, reliably and efficiently is a strategy, and a good one.”
Site Reliability Engineering
● Site Reliability Engineering (SRE) is a specialized job function that focuses on the reliability and maintainability of large systems.
● SRE is also a mindset, and a set of engineering approaches to running better production systems.
● Google has teams of site reliability engineers responsible for keeping a service globally available.
https://landing.google.com/sre/book.html
https://research.google.com/
Google has published many white papers that inspire the big data community.
● Bigtable
● GFS
● MapReduce
● Chubby
● Sawzall
● Dapper
● Dremel
● Borg
From Google to OSS
[Timeline: 2004, 2016]
Internal Google → Open Source
● Internal build system → Bazel
● Borg container orchestration → Kubernetes
● Machine learning → TensorFlow
● Go → Go
● Google Chrome → Chromium
TensorFlow
TensorFlow is what we use for our own internal machine learning projects, and now it's available to you: Google made it open source.
● More than 480 contributions
● 10,000 commits in a year
● 53k stars
Tutorials to get started are at https://www.tensorflow.org
Bazel
If you work in teams, you will need a build system. Google's build system, which Google has been working on for more than 10 years, is now available as open source, so you can benefit from it too. https://bazel.build/
● Scalable: Bazel helps you scale your organization, codebase, and continuous integration system. It handles codebases of any size, in multiple repositories or a huge monorepo.
● Platform independent: works in the cloud or on premises.
● Any language: build and test Java, C++, Android, iOS, Go, and a wide variety of other language platforms (via extensions).
Kubernetes
https://kubernetes.io
Kubernetes abstracts away the hardware infrastructure and exposes your whole data center as a single enormous computing resource.
● Multiple container engines (Docker, rkt, Windows)
● Cloud and bare-metal environments
● Container Engine = managed Kubernetes in Google Cloud
Kubernetes open source community
● 50k+ commits in Kubernetes
● 1,000+ unique contributors
● Top 0.001% of all GitHub projects
● 4,000+ external projects based on Kubernetes
Companies contributing: supported by a broad ecosystem of partners, offering you cloud provider flexibility.
Istio
https://istio.io
● A complete framework for connecting, securing, managing, and monitoring services
● Secures and monitors traffic for microservices and legacy services without requiring any changes to application code
● An open platform with key contributions from Google, IBM, Lyft, and others
● Allows developers to authenticate and secure the communications between different applications using a TLS connection
● Multi-environment and multi-platform, but Kubernetes first
Learnings: from Google to Google Cloud
[Timeline: 2004, 2016]
Google
● Build for scalability
● Build for security
Google Cloud
● Build for enterprise
○ Secure
○ Scalable
○ Compliant
1+ billion users
● 2 trillion Google searches annually
● 65 billion downloads of apps from the Google Play store
● More than 1 billion people use the Chrome browser on mobile devices every month
● 200 million people per month use Google's online photo service, Google Photos
Assessing threats
Who is the attacker? Lone wolves, script kiddies, insider risk, hacktivist groups, malicious users, criminal organizations, nation-state actors
How are they attacking? DDoS, spear-phishing, malware, XSS, man-in-the-middle, user error, social, 0-days
What do they want? $$$$$, intellectual property, espionage, vandalism, public perception, notoriety
Conclusion
Google has over a decade of experience building secure software at large scale. Your company can use the same infrastructure that Google does: scalable, secure, and open. The learnings are shared through white papers and contributed back through open source.