
Cassandra @ Cayova


Presentation to the Dublin Cassandra Users 'Cassandra at Scale' meetup.

Bill de hÓra

July 11, 2013


Transcript

  1. What does it do?
     Feed: public timeline posting
     Chat: private and group messaging
     Hub: group discussion and sharing
     Content: upload & share photo/video/music/files
     Box: personal web tracking and dashboard
  2. What’s C* being used for?
     Posts, Chat, Inbox, Notifications, Object Metadata, Counters, Timelines,
     Hashtags, Browser Metrics, System Events, Likes
  3. Why C*?
     Bypass the “startup migrates off RDBMS” war story
     Excellent robustness & availability
     Predictable scaling & cost model
     Domain and access pattern fit
     In-house experience at scale
     Strong community
  4. Node loss - still 100% available
     [11:40am] dehora: lol, ec2 killed one of the events nodes
     [11:40am] dehora: 10.53.53.155   eu-west 1a  Up    Normal  895.21 MB  75.00%  0
     [11:40am] dehora: 10.64.110.63   eu-west 1b  Up    Normal  851.05 MB  75.00%  42535295865117307932921825928971026432
     [11:40am] dehora: 10.55.65.71    eu-west 1a  Up    Normal  892.55 MB  75.00%  85070591730234615865843651857942052864
     [11:40am] dehora: 10.251.39.177  eu-west 1b  Down  Normal  430.73 MB  75.00%  127605887595351923798765477786913079296
     [11:40am] matthew: :O
     [11:40am] dehora: the last node’s instance doesn't exist anymore, but system’s fine
     [11:41am] dehora: asg spun up a new node, but it has a random token so didn't autojoin
     [11:41am] matthew: eugh >:(
     [11:41am] matthew: should Priam not have handled that?
     [11:41am] dehora: yes, but it can't
     [11:42am] dehora: the ami we're using here has a bug/feature (apache .deb starts cassandra which means priam can't assign)
     [11:42am] dehora: the latest ami (0.2.3) has a fix for that
     [11:45am] dehora: k, i'll remove that node and bring in a new one on c, done with testing 2 zone evac anyway
  5. Node Setup
     Cassandra: 1.1.11, Apache .deb, JDK6
     AWS: eu-west-1, per-cluster LC/SG, 3 x AZ
     AMI: Parsel, derived from Ubuntu 12.04 base AMI
     Servers: m1.xlarge, 1.6T 4x eph RAID0, 8G ebs
     Conf: 8GB/800M heap, RF=3, 100M keycache, -rowcache
     Management: Priam, Graphite, Boundary, jmxtrans
     Client: Astyanax, Quorum, Metrics, TokenAware, Backoff
  6. Node Setup (diagram of a single node)
     m1.xlarge: zoned, static ASG, SG
     1.6T: 4x eph mdadm RAID0, XFS
     Parsel + base AMI, Ubuntu 12.04 LTS, Oracle JDK6
     Cassandra 1.1.11, Priam/Tomcat, S3
     jmxtrans, Graphite, bprobe, Boundary, supervisord, puppet agent
     Astyanax clients
  7. Cluster Setup!
     (diagram: Astyanax clients, writes, flush, repair and load against cluster
     lc: cass_metrics_0001 in eu-west-1, spread across asg: 1a, asg: 1b, asg: 1c)
  8. Astyanax Client (Groovy)

     @Override
     Keyspace get() {
       AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
           .forCluster(cluster)
           .forKeyspace(keyspace)
           .withAstyanaxConfiguration(createAstyanaxSettings())
           .withConnectionPoolConfiguration(createConnectionPoolSettings())
           .withConnectionPoolMonitor(new YammerConnectionPoolMonitor())
           .buildKeyspace(ThriftFamilyFactory.getInstance())
       context.start()
       addShutdownHook { context.shutdown() }
       context.entity
     }
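     For context, a minimal sketch of how a Keyspace built this way might be consumed.
     Nothing below is from the deck: the PostStore class, the 'posts' column family and
     the key names are invented for illustration; consistency and retry behaviour come
     from the defaults set in createAstyanaxSettings().

     import com.netflix.astyanax.Keyspace
     import com.netflix.astyanax.model.ColumnFamily
     import com.netflix.astyanax.serializers.StringSerializer

     class PostStore {
       // hypothetical layout: rows keyed by user id, columns keyed by post id
       static final ColumnFamily<String, String> POSTS =
           ColumnFamily.newColumnFamily('posts', StringSerializer.get(), StringSerializer.get())

       private final Keyspace keyspace

       PostStore(Keyspace keyspace) { this.keyspace = keyspace }

       void put(String userId, String postId, String body) {
         // single-column write via the shared keyspace
         keyspace.prepareColumnMutation(POSTS, userId, postId)
                 .putValue(body, null)   // null = no TTL
                 .execute()
       }

       String get(String userId, String postId) {
         // single-column read for the same row/column
         keyspace.prepareQuery(POSTS)
                 .getKey(userId)
                 .getColumn(postId)
                 .execute()
                 .result
                 .stringValue
       }
     }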
  9. Astyanax Client (Groovy)

     private AstyanaxConfigurationImpl createAstyanaxSettings() {
       new AstyanaxConfigurationImpl()
           .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
           .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE)
           .setAsyncExecutor(createExecutor())
           .setClock(clock)
           .setDefaultReadConsistencyLevel(ConsistencyLevel.valueOf(defaultReadConsistencyLevel))
           .setDefaultWriteConsistencyLevel(ConsistencyLevel.valueOf(defaultWriteConsistencyLevel))
           .setRetryPolicy(new ExponentialBackoff(baseSleepTime, maxAttempts))
     }
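     The deck doesn't show createConnectionPoolSettings(); a plausible shape, assuming
     Astyanax's ConnectionPoolConfigurationImpl and Thrift on port 9160. The pool name,
     seed list and sizes below are placeholders, and the cluster and seeds fields are
     assumed to exist alongside the builder code above, not taken from the deck.

     import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl

     private ConnectionPoolConfigurationImpl createConnectionPoolSettings() {
       new ConnectionPoolConfigurationImpl("${cluster}_pool")
           .setPort(9160)              // Thrift port
           .setMaxConnsPerHost(10)     // placeholder; tune per cluster
           .setConnectTimeout(2000)    // ms
           .setSocketTimeout(5000)     // ms
           .setSeeds(seeds)            // e.g. "10.0.0.1:9160,10.0.0.2:9160"
     }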
  10. General Guidance
      Pair deploy for destructive operations
      Automate as much as possible & burn AMIs
      Use a management tool (DSE, OpsCenter, Priam)
      Set consistencylevel as QUORUM in the CLI
      Monitor growth in load
      Consider getting support
      Ask for help - mailing list, lots of community expertise
  11. Watch all the things
      Repair/Compaction spikes: nodetool compactionstats (see the JMX polling sketch after this slide)
      Disk load: nodetool info, du, iostat -x, backups off-node
      Write pressure: tpstats FlushWriter stage, memtables
      GC: grep the logs for GCInspector, top -H; we enable GC logging in prod
      Baseline heap: phat nodes might want more than 8G
      Cache hits & resizing: nodetool info, warnings in logs
      Dropped/Pending: nodetool tpstats, cfhistograms
      Flapping: up/down log messages, client markdowns ಠ_ಠ
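      As a rough illustration of the kind of polling jmxtrans feeds into Graphite, a
      Groovy sketch that reads two Cassandra 1.1-era MBean attributes over JMX. The
      host, port and bean/attribute names are assumptions that vary by version; the
      setup in the deck relies on jmxtrans configuration rather than hand-rolled scripts.

      import javax.management.ObjectName
      import javax.management.remote.JMXConnectorFactory
      import javax.management.remote.JMXServiceURL

      // assumed node address; 7199 is Cassandra's default JMX port
      def url = new JMXServiceURL('service:jmx:rmi:///jndi/rmi://10.0.0.1:7199/jmxrmi')
      def connector = JMXConnectorFactory.connect(url)
      try {
        def mbsc = connector.getMBeanServerConnection()

        // pending compactions (CompactionManager MBean, 1.1-era name)
        def compactions = mbsc.getAttribute(
            new ObjectName('org.apache.cassandra.db:type=CompactionManager'), 'PendingTasks')

        // pending mutations (MutationStage thread pool)
        def mutations = mbsc.getAttribute(
            new ObjectName('org.apache.cassandra.request:type=MutationStage'), 'PendingTasks')

        println "compaction pending=${compactions}, mutation pending=${mutations}"
      } finally {
        connector.close()
      }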
  12. Centralise Repair (Fabric)

      def repair(hostname, keyfile, username='', passwd=''):
          with settings(warn_only=True):
              env.host_string = hostname
              env.key_filename = keyfile
              # call Priam's repair endpoint on the node
              result = run("curl -v http://localhost:8080/Priam/REST/v1/cassadmin/repair")
          if result.failed:
              send_error_mail(hostname, username, passwd)
              abort("Failing Cassandra repair call on %s" % hostname)
          send_happy_mail(hostname, username, passwd)
  13. It’s all in the grind, Sizemore
      Understand row width and data access (see the timeline sketch after this slide)
      Avoid heavy delete-after-write (queues)
      Avoid read-before-write (usually)
      Understand your client
      Model, don’t prototype
      RDBMS if you need a lock
      Redis/RDBMS if you need precision counters
      Allow time to relearn & get productive
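      To make the row-width point concrete, a sketch of the classic wide-row timeline
      shape in Astyanax. Nothing here is Cayova-specific: the user_timeline column
      family, class and method names are invented. One row per user, a TimeUUID column
      appended per post, reads as a reversed slice of the newest columns.

      import com.netflix.astyanax.Keyspace
      import com.netflix.astyanax.model.ColumnFamily
      import com.netflix.astyanax.serializers.StringSerializer
      import com.netflix.astyanax.serializers.TimeUUIDSerializer
      import com.netflix.astyanax.util.RangeBuilder
      import com.netflix.astyanax.util.TimeUUIDUtils

      class TimelineStore {
        // one wide row per user; TimeUUID columns sort by time within the row
        static final ColumnFamily<String, UUID> TIMELINE =
            ColumnFamily.newColumnFamily('user_timeline', StringSerializer.get(), TimeUUIDSerializer.get())

        private final Keyspace keyspace

        TimelineStore(Keyspace keyspace) { this.keyspace = keyspace }

        void append(String userId, String postId) {
          // append-only write: a new time-ordered column per post, no read-before-write
          def batch = keyspace.prepareMutationBatch()
          batch.withRow(TIMELINE, userId)
               .putColumn(TimeUUIDUtils.getUniqueTimeUUIDinMicros(), postId, null)  // null = no TTL
          batch.execute()
        }

        List<String> latest(String userId, int count) {
          // reversed slice: newest 'count' columns from the user's row
          keyspace.prepareQuery(TIMELINE)
                  .getKey(userId)
                  .withColumnRange(new RangeBuilder().setReversed(true).setLimit(count).build())
                  .execute()
                  .result
                  .collect { it.stringValue }
        }
      }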
  14. Future
      Multi-region deploys, SSD/hi1.4xlarge
      Use cases: Graph, Recording, Tags, Thumbs
      C* 1.2/2.0: better disk density, CQL
      #1311: Triggers
      #5062: ConsistencyLevel.SERIAL
      #4865: Off-heap Bloom filters