NoSQL Databases at Stackdriver

Choose carefully and re-evaluate often
Patrick R. Eaton, PhD [email protected] @PatrickREaton
Joey Imbasciano [email protected] @_joeyi

Why We Need a Database
• Stackdriver provides an intelligent monitoring service
• Acquire billions of time series data points per day
• Must write data at wire speeds
• Read slices of data for graphing and analysis
• Also write various aggregations and summarizations

We Chose Cassandra
Key Cassandra features
• True P2P architecture
• Replicates data across fault domains
• EC2-aware data placement strategies
• Good support for write-heavy workloads
• Compatible data model for time series data
• Automatic data expiration with TTLs
Why not MySQL?
• Relational data model not a good match
• Experience with operating large, sharded deployments
Why not HBase?
• Operational complexity - zk, Hadoop, HDFS, ...
• Special "master" role
Why not Dynamo?
• Avoid vendor lock-in and high cost

Cassandra at Stackdriver
Usage
• Primary: 15 TB of Data Online, 50k+ writes/s
• (Alerting: 1 GB of Data Online, 700 writes/s)
EC2 node configuration
• m1.xlarge instances
  ◦ 8 ECUs (4 Cores x 2 ECUs), 15 GB RAM
  ◦ 4 spinning disks via mdadm RAID-0
  ◦ 1.7TB of available storage per node
Cassandra Configuration
• 36 nodes
• Ec2Snitch (Availability Zone Aware)
• Replication Factor: 3
• Vnodes
• Cost = ~$12,500/month
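
The figures on this slide allow a rough capacity check. The sketch below is a back-of-the-envelope calculation and does not come from the deck. It assumes the 15 TB figure counts every replica on disk (the ~5TB quoted on the later cost slide, times a replication factor of 3, is consistent with that reading) and that data is spread evenly across nodes. The function and variable names are illustrative.

```python
def cluster_capacity(nodes, tb_per_node, replication_factor, tb_on_disk):
    """Return (raw TB, unique-data TB, fraction of raw disk in use)."""
    raw_tb = nodes * tb_per_node
    # Every row is stored replication_factor times, so the cluster can
    # hold at most raw_tb / replication_factor of distinct data.
    unique_tb = raw_tb / replication_factor
    # Assumption: tb_on_disk already includes all replicas.
    utilization = tb_on_disk / raw_tb
    return raw_tb, unique_tb, utilization


raw_tb, unique_tb, utilization = cluster_capacity(
    nodes=36, tb_per_node=1.7, replication_factor=3, tb_on_disk=15
)
print(f"raw capacity:        {raw_tb:.1f} TB")
print(f"unique-data ceiling: {unique_tb:.1f} TB")
print(f"disk utilization:    {utilization:.0%}")
```

Under those assumptions the cluster has roughly 61 TB of raw disk and uses about a quarter of it. If the 15 TB were instead unique data before replication, the same function called with tb_on_disk=45 would show the disks about three-quarters full.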

Growing Cassandra in AWS
[Diagram: cluster layout across availability zones us-east-1a, us-east-1b, and us-east-1c, comparing "Where we started…" with "Where we are..."]

Automation in AWS
• Combination of Boto, Fabric, & Puppet
  ◦ Boto for AWS API
  ◦ Fabric + Puppet for bootstrapping
  ◦ Fabric for operations
• One CLI tool
  ◦ Launch a new cluster
  ◦ Upsize a cluster
  ◦ Replace a dead node
  ◦ Remove existing nodes
  ◦ List nodes in a cluster
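
The "one CLI tool" idea can be pictured as a single entry point with one subcommand per operation. The sketch below is not Stackdriver's tool: the program name, subcommand names, arguments, and step descriptions are all hypothetical, and it only prints a plan instead of calling Boto, Fabric, or Puppet. It uses Python's standard argparse module so that it runs without any AWS dependencies.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per operation listed on the slide."""
    parser = argparse.ArgumentParser(prog="cluster-tool")  # hypothetical name
    sub = parser.add_subparsers(dest="command", required=True)

    launch = sub.add_parser("launch", help="Launch a new cluster")
    launch.add_argument("cluster")
    launch.add_argument("--nodes", type=int, required=True)

    upsize = sub.add_parser("upsize", help="Upsize a cluster")
    upsize.add_argument("cluster")
    upsize.add_argument("--add", type=int, required=True)

    replace = sub.add_parser("replace", help="Replace a dead node")
    replace.add_argument("cluster")
    replace.add_argument("node")

    remove = sub.add_parser("remove", help="Remove existing nodes")
    remove.add_argument("cluster")
    remove.add_argument("nodes", nargs="+")

    listing = sub.add_parser("list", help="List nodes in a cluster")
    listing.add_argument("cluster")
    return parser


def plan(args):
    """Return the steps a real tool might run. Nothing here calls AWS."""
    if args.command == "launch":
        return [f"boto: start {args.nodes} instances for {args.cluster}",
                "fabric+puppet: bootstrap each new node"]
    if args.command == "upsize":
        return [f"boto: start {args.add} more instances for {args.cluster}",
                "fabric+puppet: bootstrap each new node"]
    if args.command == "replace":
        return ["boto: start 1 replacement instance",
                f"fabric+puppet: bootstrap it in place of {args.node}"]
    if args.command == "remove":
        # Assumed ordering: take nodes out of the cluster, then terminate.
        return [f"fabric: remove {', '.join(args.nodes)} from {args.cluster}",
                "boto: terminate the removed instances"]
    return [f"boto: list instances in {args.cluster}"]


# Example: plan the 36 -> 48 node upsize without touching any real cluster.
args = build_parser().parse_args(["upsize", "primary", "--add", "12"])
for step in plan(args):
    print(step)
```

Keeping the planning step separate from execution is a design choice of this sketch: it lets an operator review what would happen before any instance is started or terminated.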

Increasing Compute Demand

Increasing Storage Demand

Benchmarking Options See http://www.stackdriver.com/cassandra-aws-gce-rackspace/

Today: Next Phase of Scale
Option 1: Upsize cluster from 36 nodes -> 48 nodes
• Total cost: $16,500 / month (vs. $12,500 currently)
• Pros: known configuration, grows existing cluster
• Cons: more nodes, more problems
• Bootstrapping takes day(s)
Option 2: Build new cluster using 9 hi1.4xlarge
• Total cost: $20,000 / month
• 4x compute, 4x memory, SSD vs. spinning rust
• Everybody is doing it (Netflix, Instagram)
• Pros: fewer nodes, fewer problems
• SSDs remove I/O bottleneck for compaction
• Faster reads
• Cons: unknown configuration, requires data reload
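
The two options can be compared on incremental and per-node cost using only the figures on this slide. The sketch below is illustrative arithmetic and says nothing about which option was chosen. The dictionary layout and function name are assumptions.

```python
CURRENT_MONTHLY_COST = 12_500  # 36-node cluster, from the slide

OPTIONS = {
    "Option 1: upsize to 48 nodes": {"nodes": 48, "monthly_cost": 16_500},
    "Option 2: 9 x hi1.4xlarge": {"nodes": 9, "monthly_cost": 20_000},
}


def compare(option, current_cost=CURRENT_MONTHLY_COST):
    """Return (extra monthly cost over today, monthly cost per node)."""
    extra = option["monthly_cost"] - current_cost
    per_node = option["monthly_cost"] / option["nodes"]
    return extra, per_node


for name, option in OPTIONS.items():
    extra, per_node = compare(option)
    print(f"{name}: +${extra:,}/month, ${per_node:,.2f} per node per month")
```

On these numbers Option 2 costs $3,500 per month more than Option 1 while running about a fifth as many nodes, which is the trade-off the pros and cons describe.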

Dynamo as an Alternative?
Pros
• Hosted
• Automatic tuning
• Automatic upgrades
• Full-time operations
• “Infinitely” scalable
• Automatic scaling
• Likely decreasing costs
  ◦ AWS has a history of aggressively reducing prices
  ◦ Last Dynamo price reduction March, 2013
Cons
• Vendor lock-in
• Complicated cost model
  ◦ Based on “write units” and “read units”
  ◦ Request rate, data size, consistency model
• No organizational experience
  ◦ Must endure growing pains of new service adoption
• No TTL for data
  ◦ Impacts costs
  ◦ Efficient data deletion requires engineering investment

Dynamo Versus Cassandra Costs
Cassandra Costs
• Ongoing management = ¼ engineer, ~$3000/month
• Primary cluster - ~5TB data, ~45k w/s
• 36 m1.xlarge @ $0.48/hr = ~$12,500/month
• Alerting cluster - ~1GB data, ~700 w/s, ~2500 r/s
• 3 c3.2xlarge @ $0.60/hr = ~$1300/month
Dynamo costs
• Ongoing management = ~$0/month
• Primary cluster
  ◦ Total = ~$22,400 + reads, without reserved capacity
    ▪ storage = ~$1380/month, writes = ~$21,000/month, reads = ???
• Alerting cluster
  ◦ Total: ~$600, without reserved capacity (or $475 for eventually consistent reads)
    ▪ storage = ~$0, writes = ~$350, reads = ~$250
• Save ~53% for 1-year reserved capacity, ~76% on 3-year
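
The monthly totals on this slide can be reproduced approximately from the hourly rates and savings percentages it quotes. The sketch below is a rough check under stated assumptions: a month is taken as 730 hours, the reserved-capacity savings are applied to read and write throughput only (not storage), and primary-cluster reads are left out because the slide marks them unknown. All names are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: 8760 hours per year / 12


def monthly_instance_cost(count, hourly_rate, hours=HOURS_PER_MONTH):
    """On-demand EC2 cost per month for `count` identical instances."""
    return count * hourly_rate * hours


def with_reserved(throughput_cost, storage_cost, savings):
    """Apply a reserved-capacity saving to throughput; storage is unchanged."""
    return throughput_cost * (1 - savings) + storage_cost


# Cassandra side, from the slide's rates.
primary = monthly_instance_cost(36, 0.48)
alerting = monthly_instance_cost(3, 0.60)
cassandra_total = primary + alerting + 3000  # plus ~1/4 engineer

# Dynamo side, from the slide's monthly estimates (primary reads unknown).
dynamo_throughput = 21_000 + 350 + 250  # primary writes + alerting writes/reads
dynamo_storage = 1380                   # alerting storage is ~$0
dynamo_on_demand = dynamo_throughput + dynamo_storage
dynamo_1yr = with_reserved(dynamo_throughput, dynamo_storage, 0.53)
dynamo_3yr = with_reserved(dynamo_throughput, dynamo_storage, 0.76)

print(f"Cassandra primary:  ${primary:,.0f}/month")
print(f"Cassandra alerting: ${alerting:,.0f}/month")
print(f"Cassandra total:    ${cassandra_total:,.0f}/month")
print(f"Dynamo on demand:   ${dynamo_on_demand:,.0f}/month + primary reads")
print(f"Dynamo 1-yr reserved: ${dynamo_1yr:,.0f}/month + primary reads")
print(f"Dynamo 3-yr reserved: ${dynamo_3yr:,.0f}/month + primary reads")
```

At 730 hours per month the 36-node cluster comes to about $12,600 and the alerting cluster to about $1,300, in line with the slide's rounded figures. The Dynamo totals remain lower bounds until the primary read cost is known.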

Stackdriver http://www.stackdriver.com @stackdriver
Thank You!
Patrick R. Eaton, PhD [email protected] @PatrickREaton
Joey Imbasciano [email protected] @_joeyi
