Slide 1

Slide 1 text

Confidential & Proprietary Google Cloud Platform
Building software at Google Scale
How does Google build software, and how can you benefit from this?
By Lee Boonstra, Customer Engineer, Google Cloud
[email protected]

Slide 2

Slide 2 text

Engineering at Google
1. How the engineering processes at Google work
2. Our learnings, and how we contribute back to open source
3. From open source to Google Cloud for enterprises

Slide 3

Slide 3 text

Building software at Google

Slide 4

Slide 4 text

From product to idea
10x
Product idea
X 10

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

“To organize the world’s information and make it universally accessible and useful.” - Google

Slide 7

Slide 7 text

Project Loon: Balloon-powered internet for everyone!

Slide 8

Slide 8 text

Waymo: Self-driving car

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Prototyping: The first version of Google Glass was created in 90 minutes!

Slide 11

Slide 11 text

Dogfood

Slide 12

Slide 12 text

Code Development
Product idea -> Writing code
public class foo {}

Slide 13

Slide 13 text

What it takes to be a Google engineer
Working on problems with SPEED AND SCALE is a challenge. Engineers keep raising the bar on the tools and infrastructure.
Google Culture:
• Collaboration and co-development
• Sharing between products and teams (tools, libraries, services)
• Engineers have autonomy
• Agile/Scrum, daily stand-up meetings

Slide 14

Slide 14 text

Google’s entire codebase is a giant single repository of more than 2 billion lines of code

Slide 15

Slide 15 text

Google Repository statistics (as of Jan 2015)
Total number of files: 1+ billion
Number of source files: 9 million
Lines of code: 2+ billion
Depth of history: 35 million commits
Size of content: 86 terabytes

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Advantages of a monolithic repo
● Unified versioning - one source of truth
● Extensive code sharing and reuse
● Collaboration across teams
● Simplified dependency management
● Large-scale refactoring
● Flexible team boundaries & code ownership
● Code visibility

Slide 18

Slide 18 text

Google uses its own version control system, called Piper.
A single code tree, with fast access to the code through tooling. It holds code in all kinds of programming languages. Everyone works in trunk; branches are for releases.
Workflow:
1. Sync workspace (create a personal copy)
2. Write code
3. Code review (MANDATORY)
4. Commit
Along the way:
● Automated test / analysis
● Code quality & syntax check (by humans and by tooling)
● Read/write access per folder
● Auto rollback if needed

Slide 19

Slide 19 text

Software testing
Product idea -> Writing code -> Testing

Slide 20

Slide 20 text

Testing at Google
● Developing & testing go hand in hand
● 3 million tests a day
● 20+ OS and browser combos

Slide 21

Slide 21 text

Build processes
Product idea -> Writing code -> Testing -> Building

Slide 22

Slide 22 text

Build systems
Why do we need build systems? Code has a lot of dependencies, and you don’t want to compile and link them all manually.
The steps of a general build system:
1. Loading
2. Analysis
3. Execution by the build system
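The three steps listed on this slide can be illustrated with a minimal sketch. It is a toy under stated assumptions, not how Google's internal build system or Bazel is implemented: the `TARGETS` table, the target names and the `build` function are all hypothetical, and the "actions" are plain strings rather than real compiler invocations.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical build description: target -> (dependencies, action).
TARGETS = {
    "util.o": ([], "compile util.c"),
    "main.o": ([], "compile main.c"),
    "app": (["util.o", "main.o"], "link util.o main.o"),
}


def build(targets, goal):
    """Return the actions needed to build `goal`, dependencies first."""
    # 1. Loading: collect the goal and everything it transitively depends on.
    needed, stack = set(), [goal]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(targets[name][0])

    # 2. Analysis: turn the loaded targets into a dependency-ordered plan.
    graph = {name: set(targets[name][0]) for name in needed}
    order = list(TopologicalSorter(graph).static_order())

    # 3. Execution: a real build system would run each action here;
    #    this sketch only returns the actions in a valid order.
    return [targets[name][1] for name in order]


if __name__ == "__main__":
    for step in build(TARGETS, "app"):
        print(step)
```

Running the sketch prints the two compile actions (in either order) followed by the link action, because the link step depends on both object files.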

Slide 23

Slide 23 text

Google’s continuous build and test system
Google has its own continuous build and test system. Remember, at Google we develop everything at HEAD in the repo.
Cloud computing gives it virtually endless CPU capacity and cross-user caching.
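The slide does not say how cross-user caching works. One common approach, shown here purely as an assumption, is to key each build action by a hash of its command and input contents, so that a second user running the same action on the same inputs reuses the stored result. The `ActionCache` class and `fake_compile` function are hypothetical names for this sketch.

```python
import hashlib


class ActionCache:
    """Toy shared cache keyed by a hash of an action's command and inputs."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(command, inputs):
        # Hash the command plus every input file name and its contents.
        digest = hashlib.sha256(command.encode())
        for name in sorted(inputs):
            digest.update(b"\0" + name.encode() + b"\0" + inputs[name])
        return digest.hexdigest()

    def run(self, command, inputs, execute):
        k = self.key(command, inputs)
        if k in self._store:
            self.hits += 1  # someone already built exactly this
            return self._store[k]
        self.misses += 1
        result = execute(command, inputs)
        self._store[k] = result
        return result


def fake_compile(command, inputs):
    # Stand-in for a real compiler invocation (assumption for the sketch).
    return b"object code for " + b"".join(inputs[n] for n in sorted(inputs))


if __name__ == "__main__":
    cache = ActionCache()
    sources = {"main.c": b"int main() { return 0; }"}
    cache.run("cc main.c", sources, fake_compile)  # first user: executes
    cache.run("cc main.c", sources, fake_compile)  # second user: cache hit
    print(cache.hits, cache.misses)
```

Changing either the command or the contents of any input produces a different key, so stale results are never returned for modified sources.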

Slide 24

Slide 24 text

DevOps at Google
Product idea -> Writing code -> Testing -> Building -> Deploying

Slide 25

Slide 25 text

Each week, Google launches over 4 billion containers. Google has been using container technology for more than 10 years.

Slide 26

Slide 26 text

Enter the container
Bare-metal server: OS, Dependencies, Application Code, Hardware
Virtual machine: OS, Dependencies, Application Code, Hardware
Container: OS, Dependencies, Application Code, Hardware

Slide 27

Slide 27 text

So, you mean Docker?
● Docker is a popular software container platform.
● Containers are a way to package software in a format that can run isolated on a shared operating system.

Slide 28

Slide 28 text

Enter the container… and new challenges
● Scheduling, scaling across clusters of servers
● Networking and connectivity
● Security and access control
● Logging, monitoring, and debugging
● Health checks and uptime preservation
● ...

Slide 29

Slide 29 text

Large-scale cluster management at Google with Borg
● Borg is the software that manages all production machines at Google and runs the jobs (binaries) that engineers submit to it on those machines.
● Borg ran pretty much everything inside the company, including Google Search, Gmail, Google Maps, Google Docs...
● These binaries are run in a container environment.
● When tasks die, they are automatically started up again, and they may run on a different machine.
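The last point above, restarting dead tasks on possibly different machines, can be sketched as follows. This is an illustration only and not Borg's actual scheduling algorithm: the `reschedule` function, the task and machine names, and the "least-loaded machine" placement rule are all assumptions made for the example.

```python
from collections import Counter


def reschedule(assignments, healthy):
    """Restart tasks whose machine is gone on the least-loaded healthy machine.

    assignments: dict mapping task name -> machine name.
    healthy: collection of machine names that are currently up.
    """
    if not healthy:
        raise RuntimeError("no healthy machines available")
    healthy = sorted(healthy)  # sorted for deterministic tie-breaking

    # Tasks on healthy machines keep running where they are.
    result = {t: m for t, m in assignments.items() if m in healthy}
    load = Counter(result.values())

    # Every other task is started again, possibly on a different machine.
    for task in sorted(assignments):
        if task not in result:
            target = min(healthy, key=lambda m: (load[m], m))
            result[task] = target
            load[target] += 1
    return result


if __name__ == "__main__":
    running = {"search-0": "m1", "search-1": "m2", "gmail-0": "m2"}
    # Machine m2 has died; m3 is a spare machine that is up.
    print(reschedule(running, {"m1", "m3"}))
```

In the usage example, `search-0` stays on `m1`, while the two tasks that were on the failed machine `m2` are spread over `m3` and `m1`.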

Slide 30

Slide 30 text

Site Reliability Engineering
Product idea -> Writing code -> Testing -> Building -> Deploying -> SRE

Slide 31

Slide 31 text

“Hope is not a strategy. Engineering solutions to design, build, and run large-scale systems scalably, reliably and efficiently is a strategy, and a good one.”

Slide 32

Slide 32 text

Site Reliability Engineering
● Site Reliability Engineering is a specialized job function that focuses on the reliability and maintainability of large systems.
● SRE is also a mindset, and a set of engineering approaches to running better production systems.
● Google has teams of site reliability engineers who are responsible for keeping a service globally available.
https://landing.google.com/sre/book.html

Slide 33

Slide 33 text

Open Source
Googlers contribute back to the community.

Slide 34

Slide 34 text

Google is a leader in Open Source
287,024 commits by Googlers to open source projects on GitHub in 2016
15,000+ projects contributed to in 2016

Slide 35

Slide 35 text

Popular Google open source projects
https://opensource.google.com

Slide 36

Slide 36 text

Contributions to other popular open source projects and standards by Google

Slide 37

Slide 37 text

https://research.google.com/
Google has written many white papers that inspire the big data community:
● Bigtable
● GFS
● MapReduce
● Chubby
● Sawzall
● Dapper
● Dremel
● Borg

Slide 38

Slide 38 text

From Google to OSS
Internal Google -> Open Source
● Internal Build System -> Bazel
● Borg Container Orchestration -> Kubernetes
● Machine Learning -> TensorFlow
● Go Lang -> Go Lang
● Google Chrome -> Chromium

Slide 39

Slide 39 text

TensorFlow
TensorFlow is what we use for our own internal machine learning projects, and now it’s available to you! Google made it open source.
● More than 480 contributors
● 10,000 commits in a year
● 53k GitHub stars
Tutorials to get started at https://www.tensorflow.org

Slide 40

Slide 40 text

Bazel
You will need a build system if you work in teams. Google’s build system is now available as open source. Google has been working on it for more than 10 years, and now you can benefit from it.
https://bazel.build/
● Scalable: Bazel helps you scale your organization, codebase and Continuous Integration system. It handles codebases of any size, in multiple repositories or a huge monorepo.
● Platform independent: Works in the cloud or on premises.
● Any language: Build and test Java, C++, Android, iOS, Go and a wide variety of other language platforms (via extensions).

Slide 41

Slide 41 text

Kubernetes
https://kubernetes.io
Kubernetes abstracts away the hardware infrastructure and exposes your whole data center as a single enormous computing resource.
● Multiple container engines (Docker, rkt, Windows)
● Cloud and bare-metal environments
● Container Engine = Managed Kubernetes in Google Cloud

Slide 42

Slide 42 text

Kubernetes Open Source Community
● 50k+ commits in Kubernetes
● 1,000+ unique contributors
● Top 0.001% of all GitHub projects
● 4000+ external projects based on Kubernetes
Companies Contributing
Supported by a broad ecosystem of partners, offering you cloud provider flexibility:

Slide 43

Slide 43 text

Istio
https://istio.io
● A complete framework for connecting, securing, managing and monitoring services
● Secure and monitor traffic for microservices and legacy services without requiring any changes to application code
● An open platform with key contributions from Google, IBM, Lyft and others
● Allows developers to authenticate and secure the communications between different applications using a TLS connection
● Multi-environment and multi-platform, but Kubernetes first

Slide 44

Slide 44 text

Istio benefits: enabling hybrid
● GKE on GCP
● VMs on GCE (or elsewhere)
● K8s on-prem
● Vendor-managed K8s. EKS? AKS?

Slide 45

Slide 45 text

Google Cloud Google infrastructure for your company. Open Source

Slide 46

Slide 46 text

Storage
Compute

Slide 47

Slide 47 text

From OSS to Google Cloud
Open Source -> Google Cloud
● Kubernetes -> Google Kubernetes Engine
● Istio -> Managed Istio
● TensorFlow -> ML Engine
● MySQL / PostgreSQL -> Cloud SQL
● Spark / Hadoop -> Dataproc
● Apache Beam -> Dataflow
● iPython -> Datalab

Slide 48

Slide 48 text

Then we got serious. We built our own hardware for AI. Cloud Machine Learning Engine

Slide 49

Slide 49 text

Training a large-scale machine translation model on 32 GPUs on ⅛ of a TPU Pod

Slide 50

Slide 50 text

Learnings: From Google to Google Cloud
Google
● Build for Scalability
● Build for Security
Google Cloud
● Build for Enterprise
  ○ Secure
  ○ Scalable
  ○ Compliant

Slide 51

Slide 51 text

1+ Billion Users
● 2 trillion Google searches annually
● 65 billion downloads of apps from the Google Play store
● More than 1 billion people use the Chrome browser on mobile devices every month
● 200 million people per month use Google’s online photo service, Google Photos

Slide 52

Slide 52 text

Underwater Fiber-optic Cables: Fast Network infrastructure

Slide 53

Slide 53 text

Assessing Threats
Who is the attacker?
● Lone-wolves
● Script kiddies
● Insider risk
● Hacktivist groups
● Malicious users
● Criminal organizations
● Nation-state actors
How are they attacking?
● DDoS
● Spear-phishing
● Malware
● XSS
● Man-in-the-middle
● User error
● Social
● 0-days
What do they want?
● $$$$$
● Intellectual property
● Espionage
● Vandalism
● Public perception
● Notoriety

Slide 54

Slide 54 text

Infrastructure security
Usage: Audit Logging; Safe Browsing API; BeyondCorp; Security Key Enforcement
Operations: Compliance & Certifications; Live Migration; Infra maintenance & patching; Threat analysis and intelligence; Open Source Forensics tools; Anomaly Detection (Infrastructure); Incident Response (Infrastructure)
Deployment: Google Services TLS encryption with perfect forward secrecy; Certificate Authority; Free and automatic certificates; DDoS Mitigation (PaaS & SaaS)
Application: Peer code review & Static Analysis (Infrastructure SDLC); Source code provenance (Infrastructure); Binary Verification (Infrastructure code); WAF (PaaS & SaaS use cases); IDS/IPS (PaaS & SaaS use cases); Web Application Scanner (Google Services)
Network: Infrastructure RPC encryption in transit between data centres; DNS; Global Private Network; Andromeda SDN Controller; Jupiter Datacenter Network; B4 SDN Network
Storage: Encryption at rest; Logging; Identity and Access Management; Global at scale Key Management Service
OS + IPC: Hardened KVM Hypervisor; Authentication for each host and each job; Curated Host Images; Encryption of Interservice Communications
Boot: Trusted Boot; Cryptographic Credentials
Hardware: Purpose-built Chips; Purpose-built Servers; Purpose-built Storage; Purpose-built Network; Purpose-built Data Centers

Slide 55

Slide 55 text

Hardware
Hardware Infrastructure: Titan

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

Secure yourself on Google Cloud
Legend: By default; Google products; Partner tools; Other
Usage: Cloud Audit Logging; Safe Browsing API; Identity-Aware Proxy; Security Key Enforcement
Operations: Compliance and Certifications; Automatic Updates and Patching; Threat analysis and intelligence; Forensics; Anomaly detection; Incident Response
Deployment: Google Services TLS encryption with perfect forward secrecy; Certificate Authority; Free and automatic certificates; DDoS Mitigation via GCLB; Alternative DDoS Mitigation Solutions
Application: Code review & Static Analysis; Source code provenance; Binary verification; WAF; IDS/IPS; Vuln Management
Network: Cloud DNS; Cloud VPN; Virtual Private Cloud (VPC); Cloud Router; Shared VPC; NGFW
Storage: Encryption at rest; Logging; Identity and Access Management; Cloud Key Management Service; Customer-Supplied Encryption Keys; Data Loss Protection API
OS + IPC: Hardened KVM Hypervisor; Authentication for each host and each job; Curated Host Images; Encryption of Interservice Communications
Boot: Trusted Boot; Cryptographic Credentials
Hardware: Purpose-built Chips; Purpose-built Servers; Purpose-built Storage; Purpose-built Network; Purpose-built Data Centers
Also shown: Login anomalies for Google Identities; Google Managed Infrastructure Foundation; Threat Intelligence; CDN; Cloud Load Balancing; Web Application Scanning; DLP; Secure Config/Assessment/Enforcement

Slide 58

Slide 58 text

Conclusion
Google has over a decade of experience building secure software at large scale.
Your company can make use of the same infrastructure that Google does: scalable, secure and open.
The learnings are shared through white papers and contributed back through open source.