Overview of products in Google Cloud Platform

Google Cloud Platform is composed of many products with scattered documentation. This two-hour talk focuses on putting together some of the most compelling alternatives for computing, storage, and Big Data. Delivered at MAC in Tenerife (2015). https://www.youtube.com/watch?v=3jYExYlS8fw

Nacho Coloma

March 23, 2015

Transcript

  1. Intro to Google Cloud Platform

    Jon Lorenzo, Head of Cloud Platform for Iberia, Google. [email protected] @jonlorsan
    Nacho Coloma, CTO at Extrema Sistemas, Google Developer Expert. [email protected] @nachocoloma
  2. Cloud Platform solutions

    Four delivery models are compared: Packaged Software, IaaS (Infrastructure-as-a-Service), PaaS (Platform-as-a-Service) and SaaS (Software-as-a-Service).
    Each model is drawn as the same stack: Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking.
    Each layer of each stack is marked either "You Manage" or "Vendor Managed".
  3. For the past 15 years, Google has been building the

    most powerful cloud infrastructure on the planet. Images by Connie Zhou
  4. Google innovations in the last twelve years

    [Timeline, 2002 to 2013: GFS, MapReduce, Big Table, Dremel, Colossus, Spanner, Compute Engine. Annotation: "We stopped listening here"]
  5. Google Cloud Platform Storage Cloud Storage Cloud SQL Cloud Datastore

    Compute Compute Engine (IaaS) App Engine (PaaS) Services BigQuery Cloud Endpoints
  6. Google Cloud Platform Storage Cloud Storage Cloud SQL Cloud Datastore

    Compute Compute Engine (IaaS) App Engine (PaaS) Services BigQuery Cloud Endpoints
  7. About Google App Engine

    • Simple to scale: autoscale
    • Easy to develop: free to start, build and test locally, focus on app code
    • Trivial to manage: fully managed, no patches/updates, 24x7 operation by Google SREs
  8. App Engine

    If you don’t have a DevOps team to guarantee these:
    • Infinite scaling
    • High availability
    • Transparent security upgrades
    and instead just want to focus on delivering new features, that’s what App Engine is for.
  9. My system is not always down But when it is,

    I have an entire team of Googlers fixing it
  10. Cloud Endpoints runs on App Engine • Generates REST API

    automatically • JS, Android and iOS client libraries • Server/Client communication is hidden • Runs on same powerful infrastructure, scales infinitely
  11. (Almost) Complete Endpoint example

    @Api(name = "MyAPI", version = "v1",
         namespace = @ApiNamespace(ownerDomain = "foo.com", ownerName = "mycompany"),
         clientIds = { Constants.WEB_CLIENT_ID, Constants.ANDROID_CLIENT_ID, Constants.IOS_CLIENT_ID })
    public class MyAPI {
        @ApiMethod(path = "helloworld", name = "helloworld")
        public String hello() {
            return "Hello World!";
        }
    }
  12. Client side - JS - Calling myApi

    To call the different APIs we need to use the following pattern:
    gapi.client.{{API}}.{{methodName}}([params]).execute(callback)
    Examples:
    gapi.client.myAPI.helloworld().execute(function(resp) { … });
    gapi.client.oauth2.userinfo.get().execute(function(resp) { … });
    gapi.client.anotherAPI.entries.delete({ id: 2 }).execute(function(resp) { … });
  13. Java Client Example

    Once we have an instance of our API client, we use it just as in JS:
    myAPIService.{{methodName}}.execute()
    Examples:
    String message = myAPIService.helloworld().execute(); // helloworld method name
    Entry entry = myAPIService.entries().get().execute(); // entries.get method name
    myAPIService.entries().create(entry).execute(); // entries.create method name
  14. Reasons to use endpoints

    Pros:
    • SSL is required
    • Generate client library
    • Android, iOS, JS
    • Transparent OAuth2
    • Hosted on App Engine: scalability, managed, performance
    • Use as any other Google API
    Cons:
    • only *.appspot.com domains
    • only JSON responses (you can use App Engine for other formats)
    • only Java and Python
    • Standard App Engine limits apply (like the 60s request timeout)
  15. Google Identity Toolkit Multiple authentication options Currently Google, Facebook, Yahoo,

    Microsoft, Paypal, and AOL > go to demo site Based on standards from the OpenID Foundation You can test the standard at AccountChooser.com/ Can be used right now A seamless integration with other products in Google Cloud Platform is in the works.
  16. Compute as a Continuum

    • VM: build and deploy vm images. Basic atom. Run anything.
    • Cluster: build and deploy clusters. Managed collections. Declarative + Dynamic.
    • Platform: build and deploy apps. Curated runtimes. Rich services. Auto-everything. … just add code.
    [Axis labels: More Agility, More flexibility]
  17. GCE: Speed and Performance

    [Three bar charts comparing GCE with "Other": Latency (ms), labeled "Low Latency"; Bandwidth (Mbit/s), labeled "High Bandwidth between regions"; instance creation time as Min (s), Max (s) and Avg (s), labeled "Fast Instance Creation"]
    Source: "By the Numbers..." Performance Observations
  18. GCE: Speed and Performance “We were able to reproduce the

    result of sorting 1.5 terabytes of data in less than 1 minute consistently across multiple runs and multiple cluster configurations, thanks to the consistent and uniform performance of the Google Compute Environment.” -Yuliya Feldman, MapR • MapR broke the MinuteSort record using Google Cloud Platform • The previous record had been set using customized software and hardware
  19. GCE: Speed and Performance

    MapR/GCE World Record [1] compared with the Previous Record:
    • Data sorted: 1.5 TB (MapR/GCE), 1.4 TB (previous)
    • Number of servers (virtual instances): 2,103 (MapR/GCE), 2,200 (previous)
    • Cost per server: 58¢ per instance hour (MapR/GCE), $4,545 (previous)
    • Total cost: $20.33 (MapR/GCE), $10,000,000 (previous)
    • Time to build cluster: minutes (MapR/GCE), months (previous)
    [1] Source: Hadoop Minutesort Record Minute Sort Test
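
The $20.33 total on this slide can be reproduced from the other two figures. The sketch below assumes (the slide does not say so) that every instance is charged for exactly one minute at the quoted hourly rate:

```python
# Figures quoted on the slide for the MapR/GCE run.
instances = 2103
price_per_instance_hour = 0.58  # USD, i.e. 58 cents

# Assumption (not stated on the slide): each instance is charged for
# exactly one minute, the duration of the MinuteSort benchmark.
billed_hours = 1 / 60

total_cost = instances * price_per_instance_hour * billed_hours
print(f"Total cost: ${total_cost:.2f}")
```

Under that assumption the result rounds to the $20.33 shown in the comparison.
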
  20. Enhanced Reliability us-central1-a scheduled maintenance event Live Migration • No

    downtime during scheduled datacenter maintenance events Automatic Restart • Instances automatically restarted if subjected to system events such as hardware failure
  21. Operating System Supported images • Windows 2008 server • Several

    flavors of Linux: CentOS, Debian, Red Hat, SUSE • Community supported: CoreOS, FreeBSD, Ubuntu • Build your own image
  22. Load Balancer

    [Diagram: Google Cloud Load Balancer in front of four Compute Engine instances, three marked Healthy and one marked Unhealthy]
    • Easy to set up
    • Fast and reliable performance under predictable load
  23. Google Compute Engine • Sub-hour Billing • Up to 10TB

    Persistent Disk • Snapshotting • Over 64 Instance Types • Instance Metadata and Startup Scripts • Advanced Networking • Load Balancing
  24. Why would you do that? • Reliable deployments: no stress

    while deploying. • Repeatable artifacts: release the exact same container that you use for development. • Loosely coupled: compose applications from microservices.
  25. Containers everywhere · Everything at Google runs in a container

    · Google starts over 2 billion containers per week · With these numbers, we need better abstractions
  26. Kubernetes

    · Manage a cluster of containers, coupling them as needed for your architecture.
    · Open source: contributions from IBM, Red Hat, Docker, Mesosphere, SaltStack, CoreOS, etc.
    · It’s portable: Google Cloud Platform, other cloud providers, your own hardware, your development environment.
  27. Interacting with the user Cloud endpoints Your own HTML /

    JSON API Your app on App Engine (Java, Python, SSL)
  28. Interacting with the user Cloud endpoints Your own HTML /

    JSON API Push notifications Your app on App Engine (Java, Python, SSL)
  29. Interacting with the user Cloud endpoints Your own HTML /

    JSON API Push notifications Your app on App Engine Web Sockets (GCE or custom VMs) (Java, Python, SSL)
  30. Messaging

    Google Cloud Messaging
    • Android, iOS, Web Push
    • Up to 4KB of payload data
    • Save battery life by optimizing access to the network on mobile devices.
    • Free
    Cloud Pub/Sub
    • Server-to-server scenarios within (and across) data centers
    • Scalable and reliable messaging
    • Supports many-to-many asynchronous messaging
    • Integrates with Cloud Dataflow for data processing pipelines
    Firebase
    • Android, iOS, Web
    • Offline support
    • Keeps a copy of your data on the server
    Roll your own on GCE or custom VMs
    • Android, iOS, Web
    • Websockets
    • gRPC streaming mode using HTTP/2
    Copyright 2015 Google Inc
  31. Storage options on App Engine

    Google Cloud Datastore
    • Managed NoSQL storage
    • Unlimited scale
    • Limited query capabilities
    • Entities < 1MB
    Google Cloud Storage
    • Store big files in the cloud
    • Reliable storage
    • Encrypted at rest
    • Resumable uploads / downloads using HTTP
    • Cheaper if needed: DRA, Nearline Storage (1c per GB/month)
  32. More storage options on App Engine

    Google Cloud SQL
    • Managed plain ol’ MySQL
    • Max database size is 500GB
    BigQuery
    • Blazing fast analytics and reporting
    • Scales indefinitely (though you may want to break data in chunks for cost)
    • Can be used via API, command line or web interface
    Cloud Storage, Cloud Datastore
  33. Even more storage options on App Engine Google Compute Engine

    · Deploy your own storage solution using persistent disks: PostgreSQL, Redis, MongoDB, etc. · Some of these are available using a preconfigured stack that can be deployed with a single click. · Choose the type of storage: Standard Persistent Disks, SSD Persistent Disks, Local SSD Disks (upcoming). · Choose size: bigger is faster. Cloud Storage Cloud Datastore Cloud SQL BigQuery
  34. Compute Engine Storage Types

    • Standard Disk (block store): zone-based resource, can act as a boot disk, ideal for bulk storage
    • SSD (block store): zone-based resource, can act as a boot disk, ideal for random IOPS
    • Local SSD (beta, block store): attached to host, available in select regions, can’t act as a boot disk, high IOPS & low latency
    • Cloud Storage (object store): global/regional resource, use gsutil or API, can’t act as a boot device
  35. Yet Even more storage options on App Engine

    Google Drive API
    · Store your data in rows using Google Spreadsheets
    · Store files in Google Drive
    For Android
    · Cloud Save: Save and load a small amount of data for each user (4 x 256KB)
    · Saved Games: Like Cloud Save for games (since Jul 2014). Includes a default UI, and counts against the Drive quota of the user.
    Cloud Storage, Cloud Datastore, Cloud SQL, BigQuery, Compute Engine
  36. Storage options on App Engine Cloud Storage Cloud Datastore Cloud

    SQL BigQuery Compute Engine Drive Android
  37. Storage options on App Engine Compute Engine Cloud Storage Cloud

    Datastore Cloud SQL BigQuery Compute Engine Drive Android
  38. Caching options on App Engine

    Memcache
    • Shared or dedicated
    • Maximum size of an entity is approx. 1MB
    • Up to 20GB (dedicated)
    Roll your own cache service (e.g. Redis)
    • Cached entities can be up to 512MB in size
    • More features: sorted sets, queries, pub/sub, etc.
    • Flexible configuration: persist to disk, max. memory, eviction policy, etc.
  39. Caching options on App Engine (2)

    Edge cache
    • A CDN distributed around the world
    • No configuration needed
    • Will try to honor Cache-Control headers
    • Available for App Engine and Google Cloud Storage
    PageSpeed
    • Create sprites, inline JS, concatenate CSS, minify, resize/recompress images…
    • Available as an App Engine Service and a Nginx / Apache module.
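
Because the edge cache tries to honor Cache-Control headers, the application decides what may be cached by setting that header on its responses. A minimal sketch, assuming a plain WSGI handler; the handler name, the CSS body and the one-hour max-age are illustrative and not taken from the talk:

```python
def app(environ, start_response):
    """Serve a small static asset and mark it as publicly cacheable."""
    body = b"body { margin: 0 }"
    headers = [
        ("Content-Type", "text/css"),
        ("Content-Length", str(len(body))),
        # "public" allows shared caches to store the response;
        # max-age is the freshness lifetime in seconds (one hour here).
        ("Cache-Control", "public, max-age=3600"),
    ]
    start_response("200 OK", headers)
    return [body]
```

Responses that must never be shared between users would instead use a directive such as `private` or `no-store`.
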
  40. HTTP 2.0 (based on SPDY)

    Enabled out-of-the-box on App Engine
    · You don’t have to do anything
    Supported in other environments
    · Included with the latest Nginx
    · Adding mod_spdy with Apache
    · Up to 50% reduction in Page Load Time by reducing network latency
    · Requires SSL (in practice) and is backwards-compatible
    · Supported in all browsers (even Explorer)
    · Learn if you are already supporting it: http://spdycheck.org/
    · Check the speed difference: https://www.httpvshttps.com/
  41. The HTTP Archive Introduced in 1996 Registers the Alexa Top

    1,000,000 Sites About 400GB of raw CSV data That’s answers to a lot of questions
  42. Which sites are using Prototype and jQuery today? * as

    of June 1 2013 (not really today)
  43. How do we know that? We have a query to prove it

    SELECT pages.pageid, url, cnt, libs, pages.rank rank
    FROM [httparchive:runs.2013_06_01_pages] as pages
    JOIN (
      SELECT pageid, count(distinct(type)) cnt, GROUP_CONCAT(type) libs
      FROM (
        SELECT REGEXP_EXTRACT(url, r'(jquery|prototype).*\.js') type, pageid
        FROM [httparchive:runs.2013_06_01_requests]
        WHERE REGEXP_MATCH(url, r'jquery|prototype.*\.js')
        GROUP BY pageid, type
      )
      GROUP BY pageid
      HAVING cnt >= 2
    ) as lib ON lib.pageid = pages.pageid
    WHERE rank IS NOT NULL
    ORDER BY rank asc

    Source: http://www.igvita.com/2013/06/20/http-archive-bigquery-web-performance-answers/
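
The core of this query is the inner subquery: extract the library name from each request URL, then keep only pages where at least two distinct libraries were seen. The sketch below mirrors that logic in plain Python over a handful of made-up rows (the page ids and URLs are hypothetical, not HTTP Archive data):

```python
import re
from collections import defaultdict

# Same pattern as the REGEXP_EXTRACT call: the capture group is the library.
LIB = re.compile(r"(jquery|prototype).*\.js")

# Hypothetical (pageid, request URL) rows standing in for the requests table.
requests = [
    (1, "http://a.example/js/jquery-1.9.1.min.js"),
    (1, "http://a.example/js/prototype.js"),
    (2, "http://b.example/js/jquery.min.js"),
    (2, "http://b.example/css/site.css"),
    (3, "http://c.example/lib/prototype-1.7.js"),
    (3, "http://c.example/lib/jquery.js"),
    (3, "http://c.example/lib/jquery.ui.js"),
]

# GROUP BY pageid, type: collect the distinct libraries seen on each page.
libs_by_page = defaultdict(set)
for pageid, url in requests:
    match = LIB.search(url)
    if match:
        libs_by_page[pageid].add(match.group(1))

# HAVING cnt >= 2: keep pages that load both libraries.
both = sorted(p for p, libs in libs_by_page.items() if len(libs) >= 2)
print(both)
```

Page 2 is dropped because it only loads jQuery, and page 3 counts jQuery once even though two of its files match.
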
  44. Storage Cloud Storage Cloud SQL Cloud Datastore Compute Compute Engine

    (IaaS) App Engine (PaaS) Services BigQuery Cloud Endpoints Google Cloud Platform Big Data analysis tool
  45. Google BigQuery Analyze terabytes of data in seconds Data imported

    in bulk as CSV or JSON Supports streaming up to 100K updates/sec per table Use the browser tool, the command-line tool or REST API
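
For JSON bulk imports, BigQuery expects newline-delimited JSON: one complete object per line rather than a single JSON array. A small sketch of producing such a file body, using made-up rows; the `name` field and the second row are hypothetical, while the `lived_in` values follow the multi-valued attribute example shown later in the deck:

```python
import json

# Hypothetical rows; "lived_in" is a repeated (multi-valued) field.
rows = [
    {
        "name": "person-1",
        "lived_in": [
            {"city": "La Laguna", "since": "19752903"},
            {"city": "Madrid", "since": "20010101"},
            {"city": "Cologne", "since": "20130401"},
        ],
    },
    {"name": "person-2", "lived_in": [{"city": "Madrid", "since": "20010101"}]},
]

# One JSON object per line, no enclosing array and no trailing commas.
ndjson = "\n".join(json.dumps(row) for row in rows)
print(ndjson)
```

json.dumps never emits raw newlines inside an object, so each row is guaranteed to stay on its own line.
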
  46. BigQuery is a prototyping tool Answers questions that you need

    to ask once in your life. Has a flexible interface to launch queries interactively, thinking on your feet. Processes terabytes of data in seconds. It’s much cheaper than the alternative.
  47. What are the top 100 most active Ruby repositories on GitHub?

    SELECT repository_name, count(repository_name) as pushes, repository_description, repository_url
    FROM [githubarchive:github.timeline]
    WHERE type="PushEvent"
      AND repository_language="Ruby"
      AND PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC('2012-04-01 00:00:00')
    GROUP BY repository_name, repository_description, repository_url
    ORDER BY pushes DESC
    LIMIT 100

    Source: http://bigqueri.es/t/what-are-the-top-100-most-active-ruby-repositories-on-github/9
  48. Much more flexible than SQL

    Multi-valued attributes
    lived_in: [
      { city: 'La Laguna', since: '19752903' },
      { city: 'Madrid', since: '20010101' },
      { city: 'Cologne', since: '20130401' }
    ]
    Correlation and nth percentile
    SELECT CORR(temperature, number_of_people)
    Data manipulation: dates, urls, regex, IP...
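
CORR returns the Pearson correlation coefficient of two columns. As a rough illustration of what that number measures, here is the same statistic computed by hand; the temperature and attendance figures are invented for the example:

```python
import math


def corr(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical observations: warmer days, more people.
temperature = [10, 15, 20, 25, 30]
number_of_people = [120, 150, 210, 260, 300]

print(round(corr(temperature, number_of_people), 3))
```

A value close to 1 means the two columns rise together, a value close to -1 means one falls as the other rises, and a value near 0 means no linear relationship.
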
  49. What are the top cities contributing modifications to Wikipedia?

    SELECT COUNT(*) c, city, countryLabel, NTH(1, latitude) lat, NTH(1, longitude) lng
    FROM (
      SELECT
        INTEGER(PARSE_IP(contributor_ip)) AS clientIpNum,
        INTEGER(PARSE_IP(contributor_ip)/(256*256)) AS classB
      FROM [publicdata:samples.wikipedia]
      WHERE contributor_ip IS NOT NULL
    ) AS a
    JOIN EACH [fh-bigquery:geocode.geolite_city_bq_b2b] AS b
    ON a.classB = b.classB
    WHERE a.clientIpNum BETWEEN b.startIpNum AND b.endIpNum
      AND city != ''
    GROUP BY city, countryLabel
    ORDER BY 1 DESC

    Source: Geoip geolocation with Google BigQuery
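
The query turns each IPv4 address into an integer and derives a coarser `classB` key (the top 16 bits), so the join can use an equality on `classB` before the range check on the full number. A small sketch of the same arithmetic; the single geolocation row and the city name are hypothetical, and 192.0.2.0/24 is a documentation-only range:

```python
import ipaddress


def ip_to_int(ip):
    """Equivalent of INTEGER(PARSE_IP(ip)) for an IPv4 address."""
    return int(ipaddress.IPv4Address(ip))


def class_b(ip):
    """Equivalent of INTEGER(PARSE_IP(ip)/(256*256)): the top 16 bits."""
    return ip_to_int(ip) // (256 * 256)


# Hypothetical geolocation rows: (classB, startIpNum, endIpNum, city).
geo_rows = [
    (class_b("192.0.2.0"), ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "Example City"),
]


def locate(ip):
    """Equality match on classB first, then the BETWEEN range check."""
    num, key = ip_to_int(ip), class_b(ip)
    for row_key, start, end, city in geo_rows:
        if row_key == key and start <= num <= end:
            return city
    return None


print(ip_to_int("8.8.8.8"), class_b("8.8.8.8"))
print(locate("192.0.2.10"))
```

The equality on `classB` gives the join a key to match on, and the BETWEEN filter then narrows each candidate pair to the exact address range.
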
  50. Wrap-up: Google Cloud Platform Storage Cloud Storage Cloud SQL Cloud

    Datastore Compute Compute Engine (IaaS) App Engine (PaaS) Services BigQuery Cloud Endpoints
  51. CP300

    • The Official Google Cloud Platform Course, with a completion certification exam issued by Google.
    • 5 days of the best training:
      · App Engine, Compute Engine, Cloud Storage, Cloud SQL and BigQuery
    • Upcoming: Madrid and Barcelona 2015
    • Register here.
  52. Q? A!

    Nacho Coloma, CTO at Extrema Sistemas, Google Developer Expert. [email protected] @nachocoloma
    Jon Lorenzo, Head of Cloud Platform for Iberia, Google. [email protected] @jonlorsan