BigQuery and the world after MapReduce

Kazunori Sato

July 09, 2014

  1. BigQuery and the world
    after MapReduce
    Handling Big Data in the Googley way

  2. +Kazunori Sato
    @kazunori_279
    Solutions Architect,
    Cloud Platform GBU, Google Inc
    - GCP solutions design
    - Professional services for GCP
    - Docker/GCP meetups support

  3. “Big Data” at Google:
    How do we handle it?

  4. At Google, we have “big” big data everywhere
    What if a Googler is asked:
    “Can you give me the list of top 20 Android apps installed in 2012?”

  5. At Google, we don’t use MapReduce for this
    We use Dremel = Google BigQuery
    SELECT
      TOP(appId, 20) AS app,
      COUNT(*) AS count
    FROM installlog.2012
    ORDER BY count DESC
    It scans 68B rows in ~20 sec, with no index used.

  6. Demo

  7. BigQuery is low cost
    Storage: $0.026 per GB per month
    Queries: $5 per TB

  8. Why is BigQuery
    so fast?

  9. Column Oriented Storage
    [Diagram: Record Oriented Storage vs. Column Oriented Storage]
    Less bandwidth, more compression
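
    The layout difference is easy to see in code. A minimal Java sketch
    (hypothetical types, not BigQuery's actual ColumnIO format):

    // Record-oriented: one object per row; reading a single field
    // still drags in whole records.
    class RecordOriented {
        static class LogRow { String appId; long timestamp; String country; }
        java.util.List<LogRow> rows;
    }

    // Column-oriented: one array per column; a query on appId scans
    // only that array, and runs of repeated values compress well.
    class ColumnOriented {
        String[] appId;
        long[] timestamp;
        String[] country;
    }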

  10. Inside BQ: ColumnIO on Colossus
    Colossus File System
    - The next-generation Google File System
    Chunk Servers
    - ColumnIO files on CFS
    - Replicated to multiple DCs asynchronously
    - Mitigates tail latency, provides 99.9% availability
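
    How does replication mitigate tail latency? A common technique is to
    hedge a slow read against another replica; a minimal Java sketch of
    that general idea (an assumption, not BigQuery's published internals):

    import java.util.concurrent.*;

    class HedgedRead {
        // Ask one replica first; if it hasn't answered within the hedge
        // deadline, fire the same read at a backup replica and take
        // whichever result arrives first, so slow outliers stop
        // dominating overall latency.
        static byte[] read(ExecutorService pool,
                           Callable<byte[]> primary,
                           Callable<byte[]> backup) throws Exception {
            CompletionService<byte[]> cs = new ExecutorCompletionService<>(pool);
            cs.submit(primary);
            Future<byte[]> first = cs.poll(10, TimeUnit.MILLISECONDS);
            if (first != null) return first.get();
            cs.submit(backup);      // hedge request to a second replica
            return cs.take().get(); // first response wins
        }
    }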

  11. Massively Parallel Processing
    select top(title), count(*)
    from publicdata:samples.wikipedia
    Scanning 1 TB in 1 sec takes 5,000 disks
    Each query runs on thousands of servers

  12. Fast aggregation by tree structure
    [Diagram: Mixer 0 at the root, two Mixer 1 nodes below, four Shards
    at the leaves reading ColumnIO on Colossus; shards compute partial
    COUNT(*) ... GROUP BY state results, mixers merge them]
    SELECT state, COUNT(*) AS count_babies
    WHERE year >= 1980 AND year < 1990
    GROUP BY state
    ORDER BY count_babies DESC
    LIMIT 10

  13. Inside BQ: Query Processing
    1. Mixer receives a query
    - Interprets the query
    - Maps tables to CFS files, sends partial queries to Shards
    2. Shards process the partial queries
    - Open the ColumnIO files
    - Read the tables, apply WHERE and GROUP BY
    3. Mixer aggregates the results
    - Applies ORDER BY, LIMIT
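
    A toy Java sketch of this shard/mixer split (illustrative only, not
    Dremel internals): each shard computes a partial GROUP BY over its
    rows, and the mixer merges the partial maps before applying ORDER BY
    and LIMIT.

    import java.util.*;
    import java.util.stream.*;

    class TreeAggregation {
        // Shard step: partial COUNT(*) ... GROUP BY key over one shard's rows.
        static Map<String, Long> shardCount(List<String> keys) {
            return keys.stream()
                .collect(Collectors.groupingBy(k -> k, Collectors.counting()));
        }

        // Mixer step: merge partial maps, then ORDER BY count DESC LIMIT n.
        static List<Map.Entry<String, Long>> mix(List<Map<String, Long>> partials, int n) {
            Map<String, Long> merged = new HashMap<>();
            for (Map<String, Long> p : partials)
                p.forEach((k, v) -> merged.merge(k, v, Long::sum));
            return merged.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(n)
                .collect(Collectors.toList());
        }
    }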

  14. Inside BQ: Small JOIN
    SELECT wiki.title
    FROM [publicdata:samples.wikipedia] AS wiki
    JOIN (
      SELECT word FROM [publicdata:samples.shakespeare]
      GROUP BY word
    ) AS shakes ON wiki.title = shakes.word;

  15. Inside BQ: Small JOIN
    Small JOIN: executed with Broadcast JOIN
    - One table should be < 8MB, sent to every shard
    From: Google BigQuery Analytics
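
    A minimal Java sketch of a broadcast hash join, the strategy named
    above (hypothetical types): the small table is shipped whole to every
    shard, loaded into a hash set, and each shard joins its own slice of
    the big table locally.

    import java.util.*;

    class BroadcastJoin {
        // smallTableKeys: the broadcast copy every shard receives.
        // bigTableShardKeys: this shard's local slice of the big table.
        static List<String> joinOnShard(Set<String> smallTableKeys,
                                        List<String> bigTableShardKeys) {
            List<String> matches = new ArrayList<>();
            for (String key : bigTableShardKeys)
                if (smallTableKeys.contains(key)) matches.add(key);
            return matches;
        }
    }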

  16. Inside BQ: Big JOIN
    SELECT wiki.title
    FROM [publicdata:samples.wikipedia] AS wiki
    JOIN EACH (
      SELECT word FROM [publicdata:samples.shakespeare]
      GROUP EACH BY word
    ) AS shakes ON wiki.title = shakes.word;

  17. Inside BQ: Big JOIN
    Big JOIN: executed with shuffling
    - Both tables can be > 8MB
    - The BQ shuffler doesn’t sort; it just hash-partitions
    From: Google BigQuery Analytics
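
    A sketch of that hash-partitioning step (assumed details): rows from
    both tables are routed by a hash of the join key, so matching rows
    land on the same worker and each partition can be joined
    independently, with no sorting required.

    import java.util.*;

    class HashShuffle {
        // Route each row to a partition by hashing its join key.
        // Equal keys from either table end up in the same partition.
        static List<List<String>> partition(List<String> joinKeys, int numPartitions) {
            List<List<String>> parts = new ArrayList<>();
            for (int i = 0; i < numPartitions; i++) parts.add(new ArrayList<>());
            for (String key : joinKeys)
                parts.get(Math.floorMod(key.hashCode(), numPartitions)).add(key);
            return parts;
        }
    }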

  18. Demo

  19. Inside BQ: Big JOIN
    Big JOIN performance
    - JOIN 608M records x 38M records → 60 - 80 sec
    SELECT COUNT(*)
    FROM [bigquery-samples:wikimedia_pageviews.200801] AS a
    JOIN EACH [bigquery-samples:wikimedia_pageviews.200802] AS b
    ON a.title = b.title
    WHERE b.language = "ja"
    From: Google BigQuery Analytics

  20. BigQuery + Hadoop

  21. Connectors for Hadoop
    [Diagram: a Hadoop cluster on GCE orchestrated by bdutil — a Master
    Node (optional Name Node) and Worker Nodes with optional HDFS backed
    by Local SSD, PD SSD, or PD standard disks; the GCS Connector,
    BigQuery Connector, and Datastore Connector link the cluster to
    Google Cloud services]

  22. BigQuery Connector for Hadoop
    // Set the job-level projectId.
    conf.set(BigQueryConfiguration.PROJECT_ID_KEY, projectId);
    // Configure input parameters.
    BigQueryConfiguration.configureBigQueryInput(conf,
        "publicdata:samples.shakespeare");
    // Set InputFormat.
    job.setInputFormatClass(BigQueryInputFormat.class);

  23. BigQuery Connector for Hadoop
    // Configure output parameters.
    String outSchema =
        "[{'name': 'Word', 'type': 'STRING'}, {'name': 'Number', 'type': 'INTEGER'}]";
    BigQueryConfiguration.configureBigQueryOutput(
        conf, "gcp_samples.output", outSchema);
    // Set OutputFormat.
    job.setOutputFormatClass(BigQueryOutputFormat.class);

  24. Demo

  25. BigQuery + Fluentd

  26. BigQuery Streaming
    - Low cost: $0.01 per 100,000 rows
    - Real-time availability of data
    - Throughput: 100,000 rows per second × number of tables
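
    For context, a minimal sketch of a streaming insert using the
    BigQuery API Java client (tabledata.insertAll); the project, dataset,
    table, and field names are placeholders, and client setup is elided.

    import com.google.api.services.bigquery.Bigquery;
    import com.google.api.services.bigquery.model.TableDataInsertAllRequest;
    import com.google.api.services.bigquery.model.TableDataInsertAllResponse;
    import java.io.IOException;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    class StreamingInsert {
        // Streams one row into myProject:myDataset.installlog (placeholder IDs).
        static void insertRow(Bigquery bigquery) throws IOException {
            Map<String, Object> row = new HashMap<>();
            row.put("appId", "com.example.app");

            TableDataInsertAllRequest request = new TableDataInsertAllRequest()
                .setRows(Collections.singletonList(
                    new TableDataInsertAllRequest.Rows()
                        .setInsertId("row-0001") // dedup key for safe retries
                        .setJson(row)));

            TableDataInsertAllResponse response = bigquery.tabledata()
                .insertAll("myProject", "myDataset", "installlog", request)
                .execute();
            if (response.getInsertErrors() != null) {
                System.err.println("Insert errors: " + response.getInsertErrors());
            }
        }
    }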

  27. Slideshare uses Fluentd for collecting logs from >500 servers.
    "We take full advantage of its extendable plugin architecture and use it as a message bus that collects data
    from hundreds of servers into multiple backend systems." Sylvain Kalache, Operations Engineer

  28. Why Fluentd? Because it’s super easy to use,
    and has extensive plugins written by an active community.

  29. Now Fluentd logs can be imported into
    BigQuery really easily, at ~1M rows/s

  30. Demo

  31. BigQuery +
    User Defined Function

  32. User Defined Function with JavaScript
    BigQuery User Defined Functions in JavaScript:
    - take input JSON as arguments
    - emit output JSON as results
    JS in SQL
    - can be written in place of tables/subqueries
    Coming Soon!

  33. User Defined Function with JavaScript
    for (var i = 0; i < row.hits.length - 1; i++) {
      if (row.hits[i].page.pagepath.indexOf('github_nested') != -1 &&
          row.hits[i + 1].page.pagepath.indexOf('github_timeline') != -1) {
        result.push({
          visitid: row.visitid,
          duration: row.hits[i + 1].time - row.hits[i].time
        });
      }
    }

  34. User Defined Function with JavaScript
    Search GitHub repository for updates that change a 'Home' page:
    SELECT actor, repository_name, page_name
    FROM js(
      -- Base query.
      (SELECT actor, payload.pages.page_name, repository.name, type
       FROM [publicdata:samples.github_nested]
       WHERE payload.pages.page_name IS NOT NULL),
      -- Input fields from base query.
      actor, payload.pages.page_name, repository.name, type,
      -- Output schema of JS function.
      "[{name: 'actor', type: 'string'},
        {name: 'repository_name', type: 'string'},
        {name: 'page_name', type: 'string'}]",
      -- JS function.
      "function(r, emit) {
        var foundHome = false;
        var foundNotHome = null;
        for (var i = 0; i < r.payload.pages.length; i++) {
          if (r.payload.pages[i].page_name == 'Home') {
            foundHome = true;
          } else if (r.payload.pages[i].page_name) {
            foundNotHome = r.payload.pages[i].page_name;
          }
        }
        if (foundHome && foundNotHome) {
          emit({actor: r.actor, repository_name: r.repository.name,
                page_name: foundNotHome});
        }
      }")
    ORDER BY actor, repository_name, page_name

  35. The World after
    MapReduce

  36. [Full-slide image, from: Wired]

  37. [Diagram: end-to-end GCP data pipeline — app events and user
    interactions are published to Cloud Pub/Sub topics (Topic 1, Topic 2,
    each with ACLs); Cloud Dataflow consumes them for streaming and batch
    processing (abuse detection, user engagement analytics) and loads the
    BigQuery engine; storage services (Cloud Storage, Cloud Datastore,
    Cloud SQL) and open-source tools connect via bdutil-orchestrated
    connectors; BigQuery serves business dashboards and data science
    tools for users, devs, data scientists, and business]

  38. Cloud Pub/Sub
    Coming Soon!
    [Diagram: multiple Publishers publish to Topic 1 and Topic 2, each
    protected by ACLs; multiple Consumers subscribe to the topics]

  39. Google Cloud Dataflow
    Coming Soon!
    [Diagram: Cloud Dataflow takes pipelines such as abuse detection,
    user interactions, and user engagement analytics, in both streaming
    and batch modes, then optimizes and schedules their execution]

  40. Cloud Dataflow = Managed Flume + MillWheel on GCE

  41. Example: auto-completing hashtags
    Prefix  Suggestions
    #ar     #argentina, #arugularocks, #argylesocks
    #arg    #argentina, #argylesocks, #argonauts
    #arge   #argentina, #argentum, #argentine

  42. Tweets → Read → ExtractTags → Count → ExpandPrefixes → Top(3) → Write → Predictions
    Read: {#argentina scores!, watching #armenia vs #argentina, my #art project, …}
    ExtractTags: {argentina, armenia, argentina, art, ...}
    Count: {argentina->5M, armenia->2M, art->90M, ...}
    ExpandPrefixes: {a->(argentina, 5M), a->(armenia, 2M), …, ar->(argentina, 5M), ar->(armenia, 2M), ...}
    Top(3): {a->[apple, art, argentina], ar->[art, argentina, armenia], ...}

  43. Tweets → Read → ExtractTags → Count → ExpandPrefixes → Top(3) → Write → Predictions
    Pipeline p = Pipeline.create();
    p.begin()
      .apply(TextIO.Read.from("gs://…"))
      .apply(ParDo.of(new ExtractTags()))
      .apply(Count.create())
      .apply(ParDo.of(new ExpandPrefixes()))
      .apply(Top.largestPerKey(3))
      .apply(TextIO.Write.to("gs://…"));
    p.run();

  44. Automatically optimizes the
    execution plan DAG

  45. Let’s stream it!
    Google Cloud Pub/Sub for reads and writes.
    [Diagram: rank of #ar* hashtags over time — #argyle and
    #armeniarocks are overtaken by #argentinagoal as the game begins,
    until Armenia wins!; old data is aged out]

  46. The same code can be used for stream processing
    Pipeline p = Pipeline.create();
    p.begin()
      .apply(PubsubIO.Read.from("input_topic"))
      .apply(Bucket.by(SlidingWindows.of(60, MINUTES)))
      .apply(ParDo.of(new ExtractTags()))
      .apply(Count.create())
      .apply(ParDo.of(new ExpandPrefixes()))
      .apply(Top.largestPerKey(3))
      .apply(PubsubIO.Write.to("output_topic"));
    p.run();

  47. Demo at YouTube

  48. Questions?

  49. Backup

  50. How to analyze
    big data in real-time?

  51. Lambda Architecture is:
    A complementary pair of:
    - in-memory real-time processing: fast, but small and volatile
    - large HDD/SSD batch processing: slow, but large and persistent
    Proposed by Nathan Marz
    e.g. Twitter Summingbird
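
    A minimal Java sketch of the serving-side idea (hypothetical
    interfaces): answers merge a batch view, computed over the full
    archive up to some checkpoint, with a real-time view covering
    everything since.

    import java.util.HashMap;
    import java.util.Map;

    class LambdaServing {
        // batchView: slow, but large and persistent (e.g. periodic batch output).
        // realtimeView: fast, but small and volatile (e.g. in-memory counters).
        // A query merges both to cover all data up to "now".
        static Map<String, Long> merge(Map<String, Long> batchView,
                                       Map<String, Long> realtimeView) {
            Map<String, Long> result = new HashMap<>(batchView);
            realtimeView.forEach((k, v) -> result.merge(k, v, Long::sum));
            return result;
        }
    }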

  52. Norikra: an open source Complex Event Processing (CEP) engine
    In production use at LINE, the largest Asian SNS with 400M users, for massive log analysis

  53. Real-time analysis on streaming data
    with in-memory continuous query

  54. Lambda Architecture with BigQuery + Norikra

  55. Demo
