Cloud Bigtable @ GCPUG Taipei #2


Ian Lewis

June 06, 2015

Transcript

  1. Google Cloud Bigtable Ian Lewis, Developer Advocate, Google

  2. Agenda: (1) Bigtable and HBase - 15 minutes; (2)

    Google Cloud Bigtable - 15 minutes
  3. Cloud Bigtable

  4. “Organize the world’s information and make it universally accessible and

    useful.” - Google’s Mission Statement
  5. To organize big data... … you need a BIG database.

  6. Thus, the white paper in 2006 • Jeff Dean and

    Sanjay Ghemawat set out to figure out what this database should look like • And came up with...
  7. [Timeline of Google infrastructure innovations, 2002-2014: GFS, MapReduce,

    Bigtable, Dremel, Colossus, Spanner, Dataflow, Kubernetes]
  8. Bigtable as Inspiration: Bigtable, plus hundreds of internal services

    and applications within Google. Google is not affiliated with or endorsed by any of these companies. Apache HBase, Apache Cassandra and Apache Accumulo are trademarks of The Apache Software Foundation. Hypertable is a trademark of Hypertable Inc.
  9. The Bigtable Data Model: Bigtable (and HBase)... • is

    a NoSQL (no-join) distributed key-value store, designed to scale out • has only one index (the row key) • supports atomic single-row transactions
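The data model above can be sketched as a toy in-memory structure. This is only an illustration of the model, not the real Bigtable or HBase API; the class, row keys, and values here are all hypothetical:

```python
class SketchTable:
    """Toy model of the Bigtable data model: rows are keyed by a single
    row key, and each cell is addressed by (row key, column family,
    qualifier, timestamp), with multiple timestamped versions per cell."""

    def __init__(self):
        # row_key -> {(family, qualifier): [(timestamp, value), ...]}
        self.rows = {}

    def put(self, row_key, family, qualifier, timestamp, value):
        # In real Bigtable, all mutations to one row apply atomically
        # (single-row transactions); here we just store the version.
        cells = self.rows.setdefault(row_key, {}).setdefault((family, qualifier), [])
        cells.append((timestamp, value))
        cells.sort(reverse=True)  # newest version first

    def get(self, row_key, family, qualifier):
        # Point read via the only index: the row key.
        cells = self.rows.get(row_key, {}).get((family, qualifier), [])
        return cells[0][1] if cells else None

table = SketchTable()
table.put("user#123", "cf", "name", 1, "Ian")
table.put("user#123", "cf", "name", 2, "Ian Lewis")
print(table.get("user#123", "cf", "name"))  # newest version wins: Ian Lewis
```

The point of the sketch is the addressing scheme: everything hangs off the sorted row key, which is why row-key design dominates schema design in Bigtable.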
  10. Basic Functional Usage: • Low latency - Puts, Increments, Appends;

    Gets and short scans with filters • High throughput - full scans, MapReduce with filters, bulk import • Plus Bigtable replication
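The low-latency patterns above (point gets, and short scans over a contiguous stretch of the sorted row-key space) can be illustrated with a small sketch. The row keys and values are hypothetical; this is not an actual Bigtable client:

```python
# Hypothetical time-series rows, sorted lexicographically by row key.
rows = {
    "metric#cpu#20150601": 42,
    "metric#cpu#20150602": 57,
    "metric#mem#20150601": 81,
}

def point_get(key):
    """Low-latency single-row read via the row key (the only index)."""
    return rows.get(key)

def prefix_scan(prefix):
    """Short scan: rows sharing a key prefix are contiguous in sorted
    order, so the scan touches only a small range of the table."""
    return [(k, rows[k]) for k in sorted(rows) if k.startswith(prefix)]

print(point_get("metric#cpu#20150602"))  # 57
print(prefix_scan("metric#cpu#"))        # both cpu rows, in key order
```

This is why composite row keys like `entity#metric#date` are common: they turn "all CPU metrics" into one cheap contiguous scan rather than a full-table scan.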
  11. 3 Generations - #1: Original Bigtable • Jeff and Sanjay

    decided to build a database service that could scale linearly across thousands and thousands of commodity servers ◦ Systems will fail; retain performance at scale • Left the traditional relational model behind to achieve these goals • The first generation was about: ◦ Prototyping and building the service through its first scaling ◦ Migrating initial applications to Bigtable ◦ Inventing replication and the first multi-tenant version of Bigtable ◦ Painful rediscovery
  12. 3 Generations - #2: Bigtable Stabilized • Not only analytics

    - now web serving as well ◦ Making it very low latency and bringing down the 99th percentile of request latency [this is a hard problem] • Perfecting the Bigtable service ◦ That is: a multi-tenant shared-service model for a single database on a common set of resources [this is a hard problem] ▪ Spikes in CPU happen quickly, and reacting to abusive usage is difficult to do effectively ▪ Hard-capping leaves resources on the table, and you lose the agility and efficiency you were looking for
  13. Other Neat Bigtable Innovations • Memory-heavy clusters, especially if

    we think we can get a pretty high cache hit rate with a modest increase in memory • Mixed-media clusters - a mixture of SSD + HDD storage and the ability to specify an affinity • Tablet server failure - the target is recovery in 1 second or less rather than tens of seconds or minutes, so it appears to the customer as latency, if at all • Effortless Bigtable replication, either in multiple zones for higher availability or across the world for better latency
  14. 3 Generations - #3: Google Cloud Bigtable • Offered as

    a fully-managed service, simplifying operations and management of applications • Cloud Bigtable allows developers to quickly build applications against an industry-standard API with no need to focus on infrastructure • Simple pricing model with serving resources and storage resources separated • High performance, low latency, low cost, and little to no configuration
  15. Review: Typical Access Patterns • Cloud Bigtable Data API -

    data can be read from and written to Cloud Bigtable through a RESTful or RPC-based data service layer. Typically this serves data to applications, dashboards, and other microservices. • Streaming - data can be streamed in (written event by event) through a variety of popular stream-processing frameworks. • Batch Processing - data can be read from and written to Cloud Bigtable through batch processing systems (either MapReduce-based or analytical). Often, summarized or newly calculated data is written back to Cloud Bigtable or to a downstream database.
  16. Interface/API: Standardized • Cloud Bigtable is compatible with the HBase

    1.0+ API/client • While HBase is a separate system from Bigtable, we have close ties to the community • We like the community - lots of voices moving together, reps from many major tech companies, very widely adopted • Semantics and operations are very similar ◦ We want it to be easy to understand, transition to, and develop against • We release tools that work with both Cloud Bigtable and HBase ◦ Grow the whole community so that all benefit
  17. Pricing Model: Simple • In Cloud Bigtable you can provision

    and change the serving resources with a single button, at a single per-hour price ◦ What are Bigtable nodes? ◦ Just the raw compute power that makes up the serve path - separate from the persisted storage tier • You’re billed separately for the amount of storage you use on whatever medium you choose (SSD or HDD) • This makes it super simple to plan for your workload and understand what your costs are
  18. Pricing Model - Google Cloud Bigtable • Bigtable nodes: each

    node delivers up to 10,000 QPS and 10 MB/s of throughput; $0.65 per node-hour; minimum 3 nodes per cluster • Storage: SSD $0.17 per GB/mo; HDD $0.026 per GB/mo (coming soon) • On creation of a Bigtable cluster, customers provision throughput for their workload in the form of Bigtable nodes. Storage is charged on a per-use basis.
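As a sketch of the pricing arithmetic using the numbers on this slide (the 730 hours/month figure is an assumption for an average month, and actual billing may differ):

```python
# Prices from the slide (USD, as of this 2015 talk).
NODE_PER_HOUR = 0.65      # per node-hour
SSD_PER_GB_MONTH = 0.17   # per GB-month (SSD)
HDD_PER_GB_MONTH = 0.026  # per GB-month (HDD, "coming soon")
HOURS_PER_MONTH = 730     # assumption: average hours in a month

def monthly_cost(nodes, ssd_gb=0, hdd_gb=0):
    """Estimate a month's bill: serving (nodes) and storage are separate."""
    if nodes < 3:
        raise ValueError("minimum 3 nodes per cluster")
    serving = nodes * NODE_PER_HOUR * HOURS_PER_MONTH
    storage = ssd_gb * SSD_PER_GB_MONTH + hdd_gb * HDD_PER_GB_MONTH
    return serving + storage

# Minimum 3-node cluster with 500 GB of SSD:
# serving = 3 * 0.65 * 730 = 1423.50, storage = 500 * 0.17 = 85.00
print(round(monthly_cost(3, ssd_gb=500), 2))  # 1508.5
```

The separation is the point: you scale the node count for throughput (QPS) independently of how much data you keep, and each term of the bill is a simple multiplication.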
  19. Create/Configure UI: Easy

  20. Management: Easy • Who in the audience has used HBase

    before? • Things you will not see in Cloud Bigtable: ◦ Compactions ◦ Pre-splitting ◦ Lots of configuration settings ◦ 1-minute regionserver outages ◦ Coprocessors (for now)
  21. Cloud Bigtable Use Cases • Financial Services - faster risk

    analysis, credit card fraud/abuse detection • Marketing/Digital Media - user engagement, clickstream analysis, real-time adaptive content • Internet of Things - sensor data dashboards and anomaly detection • Telecommunications - sampled traffic patterns, metric collection and reporting • Energy - oil well sensors, anomaly detection, predictive modeling • Biomedical - genomics sequencing data analysis
  22. TLDR: Serious Machinery

  23. Cloud Bigtable Roadmap • Integrations, Integrations, Integrations! • HDD Bigtable

    (at $0.026 per GB-month) • Configurable automatic/manual replication • Additional clients • Throughput auto-scaling • Snapshots and restores • Report card ◦ No, you’re not in trouble… well, you may be.
  24. Thank you! Ian Lewis Developer Advocate Google Cloud Platform ianlewis@google.com

    @IanMLewis