+Kazunori Sato
@kazunori_279
Kaz Sato
Staff Developer Advocate,
Tech Lead for Data & Analytics,
Cloud Platform, Google Inc.
Slide 3
Slide 3 text
= The Datacenter as a Computer
Slide 5
Slide 5 text
Enterprise
Slide 6
Slide 6 text
Jupiter network
40GbE ports
10GbE x 100K = 1 Pbps
CLOS topology
Software Defined Network
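The "10GbE x 100K = 1 Pbps" figure above can be checked with a few lines of arithmetic. This is only a sketch: the function name is hypothetical, and it assumes decimal units (1 Gbps = 10^9 bit/s, 1 Pbps = 10^15 bit/s) and that every port runs at full line rate.

```python
# Sanity check of the aggregate bandwidth figure on the slide.
# Assumption: decimal units and all ports at full line rate.
GBPS = 10**9   # bits per second in 1 Gbps
PBPS = 10**15  # bits per second in 1 Pbps


def aggregate_bandwidth_bps(port_speed_bps: int, port_count: int) -> int:
    """Total bandwidth if every port carries its full line rate."""
    return port_speed_bps * port_count


total_bps = aggregate_bandwidth_bps(10 * GBPS, 100_000)
print(f"{total_bps / PBPS} Pbps")
```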
Slide 7
Slide 7 text
Borg
Launches 2B containers / week
Manages 10K machines / Cell
DC-scale proactive job scheduling
(CPU, mem, disk IO, TCP ports)
Paxos-based metadata store
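To put "2B containers / week" in perspective, it helps to convert it to an average per-second rate. The sketch below assumes launches are spread evenly over the week, which the slide does not claim; the function name is hypothetical.

```python
# Convert the weekly container launch count into an average rate.
# Assumption: launches are uniformly distributed over the week.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def launches_per_second(launches_per_week: int) -> float:
    """Average launches per second for a given weekly total."""
    return launches_per_week / SECONDS_PER_WEEK


rate = launches_per_second(2_000_000_000)
print(f"about {rate:.0f} container launches per second on average")
```

Under that assumption the average works out to a little over 3,300 launches per second.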
Slide 8
Slide 8 text
Google BigQuery
Slide 9
Slide 9 text
1 B
1 B 100 B 900 M
Slide 10
Slide 10 text
At Google, MapReduce is classic.
We use BigQuery.
Slide 11
Slide 11 text
SELECT your_data FROM billions_of_rows
WHERE full_disk_scan_required = true;
Scanning 1 TB in 1 sec
with 5,000 - 10,000 disk spindles
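The spindle count follows from simple division: scanning 1 TB in 1 second means each disk only has to deliver an ordinary sequential read rate. A minimal sketch, assuming decimal units (1 TB = 10^12 bytes), perfectly even striping, and no overhead; the function name is hypothetical.

```python
# Per-disk read throughput needed to scan 1 TB in 1 second.
# Assumptions: decimal units, data evenly striped, no coordination overhead.
TB = 10**12  # bytes
MB = 10**6   # bytes


def per_spindle_mb_per_sec(total_bytes: int, seconds: float, spindles: int) -> float:
    """Throughput each disk must sustain, in MB/s."""
    return total_bytes / seconds / spindles / MB


for spindles in (5_000, 10_000):
    rate = per_spindle_mb_per_sec(1 * TB, 1, spindles)
    print(f"{spindles} spindles -> {rate:.0f} MB/s per spindle")
```

With 5,000 spindles each disk reads 200 MB/s; with 10,000 it reads 100 MB/s.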
Slide 12
Slide 12 text
BigQuery Analytic Service in the Cloud
How to use BigQuery?
Import: Google Analytics, ETL tools, Connectors, Google Cloud
Analyze: BigQuery
Export: BI tools and Visualization, Google Cloud, Spreadsheets, R, Hadoop
Slide 13
Slide 13 text
Benefits of BigQuery
Blazingly Fast: Capable of scanning 10B rows in ~10 sec
Low Cost: Storage $0.020 per GB per month; Queries $5 per TB
Fully Managed: Use thousands of servers with zero-ops
SQL: Simple and intuitive SQL with JS UDF
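The Low Cost figures above lend themselves to a quick estimate. The sketch below uses only the two prices quoted on the slide ($0.020 per GB per month for storage, $5 per TB for queries). It assumes no free tier, discounts, or other charges, and the function name and sample workload are hypothetical.

```python
# Rough monthly cost estimate from the prices quoted on the slide.
# Assumption: no free tier, discounts, or other charges apply.
STORAGE_USD_PER_GB_MONTH = 0.020
QUERY_USD_PER_TB = 5.0


def monthly_cost_usd(stored_gb: float, scanned_tb: float) -> float:
    """Storage cost plus query cost for one month."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    queries = scanned_tb * QUERY_USD_PER_TB
    return storage + queries


# Hypothetical workload: 1,000 GB stored, 10 TB scanned by queries per month.
print(f"${monthly_cost_usd(1_000, 10):.2f} per month")
```

For that sample workload, storage contributes $20 and queries $50, so query volume rather than storage dominates the bill.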