Deep Dive into Google Cloud Technology

for YAPC Asia Tokyo 2015

GoogleCloudPlatformJapan

August 21, 2015

Transcript

  1. Deep Dive into Google Cloud Technology
    YAPC Asia Tokyo 2015
    #yapcasia #yapcasiaD

  2. Kazunori Sato (+Kazunori Sato, @kazunori_279)
    Developer Advocate, Cloud Platform, Google Inc.
    Cloud community advocacy
    Cloud product launch support

  3. is:

  5. Enterprise

  6. (Timeline figure with year marks 2003, 2004, 2006, 2010, 2011, 2012 and 2015, showing GFS, MapReduce, Bigtable, Chubby, Borg, Colossus, Dremel and Spanner)

  8. Building what’s next

  9. The Google Cloud Technology
    1 Big Data
    2 Container
    3 Networking
    4 The Future

  10. Confidential & Proprietary
    Google Cloud Platform
    Big Data

  11. 1 B
    1 B 100 B 900 M

  12. At Google, MapReduce is classic.
    We use Dremel, FlumeJava and MillWheel.

  13. The World Beyond MapReduce
    (Timeline figure with year marks 2002, 2004, 2006, 2008, 2010, 2012 and 2013, showing GFS, MapReduce, FlumeJava, Dremel, MillWheel, Cloud Dataflow and BigQuery)

  15. Google BigQuery
    Demo: RegEx + GROUP BY on 100 B rows
    response: ~10 sec
    read: 4 TB
    RegEx: 100 B
    shuffled: 278 GB

  16. SELECT your_data FROM billions_of_rows
    WHERE full_disk_scan_required = true;
    Scanning 1 TB in 1 sec takes 5,000 to 10,000 disk spindles
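
A back-of-envelope check of the figure on this slide, sketched in Java. It assumes decimal units (1 TB = 1,000,000 MB) and an even spread of the scan across all spindles; neither assumption is stated on the slide, and `ScanThroughput` and `perDiskMBps` are names made up for this illustration.

```java
// Rough arithmetic only: how fast must each disk read to scan 1 TB in 1 second?
// Assumptions (not from the slide): decimal units and perfectly even distribution.
final class ScanThroughput {
    /** Required sequential read rate per disk, in MB/s. */
    static double perDiskMBps(double totalTB, double seconds, int disks) {
        double totalMB = totalTB * 1_000_000.0; // 1 TB = 1,000,000 MB (decimal)
        return totalMB / seconds / disks;
    }

    public static void main(String[] args) {
        // The two ends of the range quoted on the slide.
        System.out.println("5,000 disks:  " + perDiskMBps(1.0, 1.0, 5_000) + " MB/s each");
        System.out.println("10,000 disks: " + perDiskMBps(1.0, 1.0, 10_000) + " MB/s each");
    }
}
```

Under these assumptions each spindle has to sustain between 100 and 200 MB/s, which is why a one-second full scan needs thousands of disks working in parallel.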

  17. (Diagram: Dremel serving tree, with the part of the query handled at each level)
    Mixer 0: ORDER BY count_babies DESC LIMIT 10
    Mixer 1 (x2): COUNT(*) GROUP BY state
    Shard (x4): COUNT(*) GROUP BY state WHERE year >= 1980 and year < 1990
    Colossus: SELECT state, year
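
The tree on this slide can be read as a partial aggregation that is merged on the way up. The following is a minimal single-process sketch of that idea in plain Java, not Dremel code: `ServingTreeSketch`, `Row`, `shardCount`, `mix` and `topN` are hypothetical names, and the sample rows are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: shards count locally, mixers sum partial counts,
// the root sorts and applies the limit. None of these names are Dremel APIs.
final class ServingTreeSketch {
    /** One row of a hypothetical input table. */
    static final class Row {
        final String state;
        final int year;
        Row(String state, int year) { this.state = state; this.year = year; }
    }

    /** Shard level: filter by year, then COUNT(*) GROUP BY state on local rows. */
    static Map<String, Long> shardCount(List<Row> rows, int fromYear, int toYear) {
        Map<String, Long> counts = new HashMap<>();
        for (Row r : rows) {
            if (r.year >= fromYear && r.year < toYear) {
                counts.merge(r.state, 1L, Long::sum);
            }
        }
        return counts;
    }

    /** Mixer level: merge the children's partial counts by summing per state. */
    static Map<String, Long> mix(List<Map<String, Long>> partials) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((state, n) -> merged.merge(state, n, Long::sum));
        }
        return merged;
    }

    /** Root level: ORDER BY count DESC LIMIT n. */
    static List<Map.Entry<String, Long>> topN(Map<String, Long> counts, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        // Two invented shards; a real tree would have many more.
        List<Row> shard1 = Arrays.asList(new Row("CA", 1985), new Row("TX", 1979));
        List<Row> shard2 = Arrays.asList(new Row("CA", 1989), new Row("TX", 1980));
        Map<String, Long> merged = mix(Arrays.asList(
                shardCount(shard1, 1980, 1990),
                shardCount(shard2, 1980, 1990)));
        System.out.println(topN(merged, 10));
    }
}
```

Because a count can be summed in any grouping, each mixer only has to forward one small map per child rather than the rows themselves.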

  18. Super fast shuffling:
    1 B x 1 B JOIN in 30 sec. How?
    (Diagram: six Dremel shards and a "?")

  19. But Dremel is not a silver bullet.
    What about complex batch or real-time stream processing?

  22. Cloud Dataflow: FlumeJava + MillWheel
    Fully Managed & Optimized
    Super fast shuffling
    Exactly-Once pipeline
    Batch + Streaming

  23. Example: Autocomplete
    Tweets -> read -> ExtractTags -> Count -> ExpandPrefixes -> Top(3) -> write -> Predictions
    read: #argentina scores, my #art project, watching #armenia vs #argentina
    ExtractTags: #argentina #art #armenia #argentina
    Count: (argentina, 5M) (art, 9M) (armenia, 2M)
    ExpandPrefixes: a->(argentina,5M) ar->(argentina,5M) arg->(argentina,5M) ar->(art, 9M) ...
    Top(3): a->[apple, art, argentina] ar->[art, argentina, armenia]
    Pipeline p = Pipeline.create();
    p.begin()
      .apply(TextIO.Read.from(...))
      .apply(ParDo.of(new ExtractTags()))
      .apply(Count.create())
      .apply(ParDo.of(new ExpandPrefixes()))
      .apply(Top.largestPerKey(3))
      .apply(TextIO.Write.to(...));
    p.run();
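
The ExpandPrefixes and Top(3) steps are only named on the slide. As a rough illustration of what they compute, here is a plain-Java sketch that runs in memory and does not use the Dataflow SDK; `AutocompleteSketch`, `TagCount`, `expandPrefixes` and `topPerPrefix` are hypothetical names, and the counts are the ones shown on the slide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of two pipeline steps; not the deck's actual DoFns.
final class AutocompleteSketch {
    /** A hashtag together with its occurrence count. */
    static final class TagCount {
        final String tag;
        final long count;
        TagCount(String tag, long count) { this.tag = tag; this.count = count; }
    }

    /** ExpandPrefixes: for every prefix of every tag, emit prefix -> (tag, count). */
    static Map<String, List<TagCount>> expandPrefixes(Map<String, Long> tagCounts) {
        Map<String, List<TagCount>> byPrefix = new HashMap<>();
        for (Map.Entry<String, Long> e : tagCounts.entrySet()) {
            String tag = e.getKey();
            for (int len = 1; len <= tag.length(); len++) {
                byPrefix.computeIfAbsent(tag.substring(0, len), k -> new ArrayList<>())
                        .add(new TagCount(tag, e.getValue()));
            }
        }
        return byPrefix;
    }

    /** Top(n): keep the n most frequent tags for each prefix, most frequent first. */
    static Map<String, List<String>> topPerPrefix(Map<String, List<TagCount>> byPrefix, int n) {
        Map<String, List<String>> result = new HashMap<>();
        for (Map.Entry<String, List<TagCount>> e : byPrefix.entrySet()) {
            List<TagCount> sorted = new ArrayList<>(e.getValue());
            sorted.sort((a, b) -> Long.compare(b.count, a.count));
            List<String> top = new ArrayList<>();
            for (int i = 0; i < Math.min(n, sorted.size()); i++) {
                top.add(sorted.get(i).tag);
            }
            result.put(e.getKey(), top);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("argentina", 5_000_000L);
        counts.put("art", 9_000_000L);
        counts.put("armenia", 2_000_000L);
        System.out.println(topPerPrefix(expandPrefixes(counts), 3).get("ar"));
    }
}
```

In the pipeline these two steps run as parallel transforms over keyed collections; the maps here only stand in for that to keep the sketch self-contained.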

  24. Streaming made easy
    Pipeline p = Pipeline.create(new PipelineOptions());
    p.begin()
      .apply(PubsubIO.Read.topic("input_topic"))
      .apply(Window.into(SlidingWindows.of(
          Duration.standardMinutes(60))))
      .apply(ParDo.of(new ExtractTags()))
      .apply(Count.perElement())
      .apply(ParDo.of(new ExpandPrefixes()))
      .apply(Top.largestPerKey(3))
      .apply(PubsubIO.Write.topic("output_topic"));
    p.run();
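
The only change from the batch version is the source, the sink and the sliding window. As a hedged sketch of what a sliding window assignment means, the plain-Java helper below lists the windows one element falls into. The slide gives only the window size (60 minutes); the period between window starts, the helper name `windowStarts` and the use of plain minute numbers as timestamps are assumptions made for this illustration, and this is not the SDK's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of sliding-window assignment with windows [start, start + size).
// Timestamps, size and period share one unit (minutes here, by assumption).
final class SlidingWindowSketch {
    /** Start times of every window containing the timestamp, newest first. */
    static List<Long> windowStarts(long timestamp, long size, long period) {
        List<Long> starts = new ArrayList<>();
        // Latest window start that is aligned to the period and not after the timestamp.
        long lastStart = timestamp - Math.floorMod(timestamp, period);
        // Step back one period at a time while the window still covers the timestamp.
        for (long start = lastStart; start > timestamp - size; start -= period) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // An element at minute 125, with 60-minute windows starting every 30 minutes.
        System.out.println(windowStarts(125, 60, 30));
    }
}
```

With a period shorter than the size, each element is counted in several overlapping windows, so the per-window counts downstream reflect a moving view of the last hour.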

  25. Container

  26. Google confidential | Do not distribute
    Every Google service runs on Borg
    2 B containers every week

  28. Borg
    No VMs, pure containers
    Manages 10K machines per cell
    DC-scale proactive job scheduling
    (CPU, memory, disk I/O, TCP ports)
    Paxos-based metadata store

  29. one machine

  30. Google App Engine: Borg for Everyone

  31. Starts up in ~40 ms
    Scales to 1 M req/s

  32. Kubernetes (k8s) 1.0 is open source, Docker based
    Google Container Engine is a fully managed k8s
    (Diagram: Kubernetes Master with API Server, Scheduler, Replication Controller and Kube-UI, and four Kubelets)

  33. Google Cloud Storage Nearline is cheap,
    because disk I/O (not space) is the cost.

  34. Networking

  35. We build our network from scratch.

  36. 82 Tbps 1.3 Pbps

  38. Jupiter network
    40 G ports
    10 G x 100 K = 1 Pbps total
    Clos topology
    Software Defined Network

  39. Google Compute Engine
    Inter-zone iperf speed: 9 Gbps
    Inter-region private network: by default

  40. GCE Load Balancer
    One global IP, multi-region LB/fail-over, 1 M req/s
    (Diagram: one IP address, 11.22.33.44, with VMs in the US, EU and Asia)

  41. The Future

  47. Vision API for Google Play services

  48. ...and More

  49. The Google Cloud Technology: Summary
    1 Big Data: the World beyond MapReduce
    2 Container: from Borg to k8s
    3 Networking: Google is the Network
    4 The Future: is Now

  50. Thank you
