Slide 1


NoSQL Databases at Stackdriver
Choose carefully and re-evaluate often

Patrick R. Eaton, PhD
[email protected]
@PatrickREaton

Joey Imbasciano
[email protected]
@_joeyi

Slide 2


Why We Need a Database
● Stackdriver provides an intelligent monitoring service
● Acquire billions of time series data points per day
● Must write data at wire speed
● Read slices of data for graphing and analysis
● Also write various aggregations and summarizations

Slide 3


We Chose Cassandra

Key Cassandra features
● True P2P architecture
● Replicates data across fault domains
● EC2-aware data placement strategies
● Good support for write-heavy workloads
● Compatible data model for time series data
● Automatic data expiration with TTLs (see the sketch below)

Why not MySQL?
● Relational data model is not a good match
● Experience operating large, sharded deployments

Why not HBase?
● Operational complexity - ZooKeeper, Hadoop, HDFS, ...
● Special "master" role

Why not DynamoDB?
● Avoid vendor lock-in and high cost
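
As a concrete illustration of the TTL feature, here is a minimal sketch using the Python cassandra-driver package; the seed address, keyspace, table, and the 30-day TTL are hypothetical, not Stackdriver's actual schema:

    # Minimal sketch: automatic data expiration via a per-write TTL.
    # The node address, keyspace, and table below are placeholders.
    from cassandra.cluster import Cluster

    cluster = Cluster(['10.0.0.1'])        # placeholder seed node
    session = cluster.connect('metrics')   # hypothetical keyspace

    # Each point expires 30 days (2592000 s) after the write lands;
    # no separate delete or cleanup job is needed.
    session.execute(
        "INSERT INTO datapoints (series_id, ts, value) "
        "VALUES (%s, %s, %s) USING TTL 2592000",
        ('cpu.user.host42', 1375000000, 0.37))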

Slide 4


Cassandra at Stackdriver

Usage
● Primary: 15 TB of data online, 50k+ writes/s
● (Alerting: 1 GB of data online, 700 writes/s)

EC2 node configuration
● m1.xlarge instances
  ○ 8 ECUs (4 cores x 2 ECUs), 15 GB RAM
  ○ 4 spinning disks via mdadm RAID-0
  ○ 1.7 TB of available storage per node

Cassandra configuration
● 36 nodes
● Ec2Snitch (availability-zone aware; cassandra.yaml sketch below)
● Replication factor: 3
● Vnodes
● Cost: ~$12,500/month
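
The snitch and vnode choices map onto two settings in cassandra.yaml; a sketch of the relevant excerpt (values shown are common defaults, not a verified Stackdriver configuration):

    # cassandra.yaml excerpt (sketch)
    endpoint_snitch: Ec2Snitch   # availability-zone-aware replica placement on EC2
    num_tokens: 256              # enables vnodes; 256 is the usual default

The replication factor itself is set per keyspace rather than in cassandra.yaml.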

Slide 5


Growing Cassandra in AWS

[Diagram: cluster growth from "Where we started…" (a single availability zone) to "Where we are..." (nodes spread across us-east-1a, us-east-1b, and us-east-1c)]

Slide 6


Automation in AWS
● Combination of Boto, Fabric & Puppet
  ○ Boto for the AWS API
  ○ Fabric + Puppet for bootstrapping
  ○ Fabric for operations
● One CLI tool (sketched below)
  ○ Launch a new cluster
  ○ Upsize a cluster
  ○ Replace a dead node
  ○ Remove existing nodes
  ○ List nodes in a cluster
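
A sketch of what such a tool can look like, assuming boto (the 2013-era Python AWS library) and Fabric 1.x; the AMI id, security group, and tag names are placeholders, not Stackdriver's actual tooling:

    import boto.ec2
    from fabric.api import env, sudo

    def launch_node(cluster_name):
        """Launch one m1.xlarge and tag it as a member of the cluster."""
        conn = boto.ec2.connect_to_region('us-east-1')
        reservation = conn.run_instances(
            'ami-00000000',                 # placeholder Cassandra AMI
            instance_type='m1.xlarge',
            security_groups=['cassandra'])  # placeholder security group
        instance = reservation.instances[0]
        instance.add_tag('cluster', cluster_name)
        return instance

    def bootstrap(host):
        """Hand a fresh node off to Puppet for install and configuration."""
        env.host_string = host
        sudo('puppet agent --test')

    def list_nodes(cluster_name):
        """List running instances tagged as members of the cluster."""
        conn = boto.ec2.connect_to_region('us-east-1')
        return conn.get_only_instances(
            filters={'tag:cluster': cluster_name,
                     'instance-state-name': 'running'})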

Slide 7


Increasing Compute Demand

Slide 8


Increasing Storage Demand

Slide 9


Benchmarking Options

See http://www.stackdriver.com/cassandra-aws-gce-rackspace/

Slide 10


Today: Next Phase of Scale

Option 1: Upsize the cluster from 36 to 48 nodes
● Total cost: ~$16,500/month (vs. ~$12,500 currently)
● Pros: known configuration; grows the existing cluster
● Cons: more nodes, more problems; bootstrapping takes day(s)

Option 2: Build a new cluster of 9 hi1.4xlarge nodes
● Total cost: ~$20,000/month
● 4x compute, 4x memory, SSDs vs. spinning rust
● Everybody is doing it (Netflix, Instagram)
● Pros: fewer nodes, fewer problems; SSDs remove the I/O bottleneck for compaction; faster reads
● Cons: unknown configuration; requires a data reload
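
A back-of-envelope check of these figures, assuming a 30-day (720-hour) month and 2013 on-demand us-east-1 prices ($0.48/hr for m1.xlarge is from the slides; ~$3.10/hr for hi1.4xlarge is an assumption consistent with the total above):

    # Rough monthly cost of each option (Python).
    hours = 720                   # assume a 30-day month
    current = 36 * 0.48 * hours   # 36 m1.xlarge  -> ~$12,400/month
    option1 = 48 * 0.48 * hours   # 48 m1.xlarge  -> ~$16,600/month
    option2 = 9 * 3.10 * hours    # 9 hi1.4xlarge -> ~$20,100/month
    print(current, option1, option2)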

Slide 11


DynamoDB as an Alternative?

Pros
● Hosted
  ○ Automatic tuning
  ○ Automatic upgrades
  ○ Full-time operations
● "Infinitely" scalable
  ○ Automatic scaling
● Likely decreasing costs
  ○ AWS has a history of aggressively reducing prices
  ○ Last DynamoDB price reduction: March 2013

Cons
● Vendor lock-in
● Complicated cost model
  ○ Based on "write units" and "read units"
  ○ Depends on request rate, data size, and consistency model
● No organizational experience
  ○ Must endure the growing pains of adopting a new service
● No TTL for data
  ○ Impacts costs
  ○ Efficient data deletion requires engineering investment

Slide 12


DynamoDB Versus Cassandra Costs

Cassandra costs
● Ongoing management = ¼ engineer, ~$3,000/month
● Primary cluster - ~5 TB data, ~45k w/s
  ○ 36 m1.xlarge @ $0.48/hr = ~$12,500/month
● Alerting cluster - ~1 GB data, ~700 w/s, ~2,500 r/s
  ○ 3 c3.2xlarge @ $0.60/hr = ~$1,300/month

DynamoDB costs
● Ongoing management = ~$0/month
● Primary cluster
  ○ Total = ~$22,400/month + reads, without reserved capacity
    ■ Storage = ~$1,380/month, writes = ~$21,000/month (sanity check below), reads = ???
● Alerting cluster
  ○ Total = ~$600/month, without reserved capacity (or ~$475 with eventually consistent reads)
    ■ Storage = ~$0, writes = ~$350, reads = ~$250
● Save ~53% with 1-year reserved capacity, ~76% with 3-year
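
The primary-cluster write figure can be sanity-checked against DynamoDB's mid-2013 price of $0.0065/hour per 10 provisioned write units, assuming each data point fits in a single 1 KB write unit:

    # Rough DynamoDB write cost for the primary workload (Python).
    writes_per_sec = 45000                 # sustained write rate from above
    write_units = writes_per_sec           # 1 write unit per write at <= 1 KB
    hourly = write_units / 10.0 * 0.0065   # -> $29.25/hour
    print(hourly * 720)                    # 30-day month -> ~$21,060/month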

Slide 13


Thank You!

Stackdriver
http://www.stackdriver.com
@stackdriver

Patrick R. Eaton, PhD
[email protected]
@PatrickREaton

Joey Imbasciano
[email protected]
@_joeyi