How we stored JSON in Cassan

Store JSON in Cassandra the Hard Way
Josh Dzielak @dzello
A Personal Data Modeling Journey
07/23/2014 #CassandraSF

I’m Josh Formerly

Our Hosts

This talk is about data modeling

row key | column name  | column name  | . . .
        | column value | column value | . . .

This talk is also about

A personal

intense

rewarding

Journey

designing a new, complex distributed system

out of numerous other quasi-documented heterogeneous distributed systems.

The Hype Cycle of New Technologies

Hype Cycle Example - WWW
Peak: Democratized information for All… World Peace!
Trough: Slow Connectivity / Walled Gardens
Slope: Broadband / Web 1.0, Web 2.0 + Social
Today: The Web is a Stable, Productive Platform

The Hype Cycle Applies to Projects Too

Peak of Inflated Expectations: Shiny Egg!
• Faberge egg is gorgeous
• Prototype is gorgeous

Trough of Disillusionment
• Faberge egg is fragile
• Prototype is fragile

Trough of Disillusionment Rabbit Hole < …

Trough of Disillusionment Sink Hole < …

Trough of Disillusionment Zombie Hole!

Deep Breaths

Drawing Board

Slope of Enlightenment Starts with…

Slope of Enlightenment An moment!

Productivity Stable predictable progress

Hype Cycle In Photos

Hype Cycle in Code Peak Trough Slope Productivity!

Hype Cycle Commit Messages Sorry

The Hype Cycle applies to building distributed systems Like the
one at Keen IO

Keen IO Analytics API for Developers <3

Keen IO, 2012 - 2013
• Speed to market
• Great for JSON
• Strong Community

Keen IO, 2013 … ?
• Linear Scale
• Operational Story
• Strong Community

But wait

wat JSON?

Cassandra does not have a JSON data type What do
we do?

Example JSON Event
Models an HTTP request

{
  "id": "c645-abd3",
  "timestamp": "2013-08-06T12:30:00-0700Z",
  "url": "https://keen.io",
  "method": "GET",
  "response": { "status": 200 }
}

Use static column families?

CREATE TABLE requests (
  KEY uuid PRIMARY KEY,
  timestamp text,
  url text,
  response_status int,
  …
)

• Possible if schema is known in advance
• Still might not be fast to scan +1M / +1B rows
• Not possible with Keen - schema is implicitly defined by customers, and not enforced at write-time
• Keen has 500,000+ user-defined collections today, stored in 1 Cassandra column family

3 Perfectly Plausible Attempts to Store JSON in Cassandra And
The 1 That Worked

CF: events

Row key    Columns (name → value)
requests   cef7-be80:TimeUUID() → { "status" : 200 }
           a87b-472c:TimeUUID() → { "status" : 400 }
           . . .

Row Key: UTF8Type. User’s collection name (w/ Project ID prefix, omitted for brevity)
Column Name: Composite(UTF8Type, TimeUUID). A unique UUID for the event, plus a TimeUUID

Good
• Simple write path, can filter by timeframe

Bad
• Row growth is unbounded; hot spots for big collections
• Can only filter on time by > or <, not both
• Little gains from parallelizing b/c row is the same
• JSON deserialization in queries is expensive

CF: events

Row key      Columns (name → value)
requests-0   cef7-be80:TimeUUID() → { "status" : 200 }
             a87b-472c:TimeUUID() → { "status" : 304 }
             . . .
requests-1   9f45-bf97:TimeUUID() → { "status" : 417 }
             76ab-dca6:TimeUUID() → { "status" : 200 }
             . . .
requests-n   . . .

• Turned ‘requests’ into ‘requests-[0…n]’
• No more unbounded row growth
• Obvious, effective way to parallelize queries
• Required another CF, an ‘index’ to store pointers to these ‘buckets’:

CF: index

Row key    Columns (name → value)
requests   0 → { "start" : "2014-07-21" … }
           1 → { "start" : "2014-07-22" … }
           . . .
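The bucketed write path can be sketched in a few lines. This is only an illustration under stated assumptions: plain Python dicts stand in for the ‘events’ and ‘index’ column families, and `BUCKET_SIZE`, `new_store`, `write_event` and `bucket_row_keys` are hypothetical names, not anything from Keen’s code. In a real cluster the two writes would have to be applied together.

```python
import json
import uuid
from collections import defaultdict

BUCKET_SIZE = 3  # hypothetical limit, tiny so the rollover is easy to see


def new_store():
    """Two in-memory stand-ins for the 'events' and 'index' column families."""
    return {"events": defaultdict(dict), "index": defaultdict(dict)}


def write_event(store, collection, event, day):
    """Append one JSON event to the current bucket, rolling over when it is full."""
    buckets = store["index"][collection]           # bucket number -> metadata
    if not buckets:
        buckets[0] = {"start": day}
    current = max(buckets)
    if len(store["events"][f"{collection}-{current}"]) >= BUCKET_SIZE:
        current += 1
        buckets[current] = {"start": day}          # extra write to the index CF
    row_key = f"{collection}-{current}"
    column_name = f"{event['id']}:{uuid.uuid1()}"  # event UUID plus a TimeUUID
    store["events"][row_key][column_name] = json.dumps(event)
    return row_key


def bucket_row_keys(store, collection):
    """Row keys a query can fan out over, one worker per bucket."""
    return [f"{collection}-{n}" for n in sorted(store["index"][collection])]
```

Writing seven events with a bucket size of three fills ‘requests-0’ and ‘requests-1’ and starts ‘requests-2’; the index row is what lets a reader discover those row keys.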

Concerns
• Write path has become more complex; position of bucket index must be kept in another CF and rolled over
• Keeping consistent state for more than 1 CF (solution: atomic batch)
• Still have to deserialize JSON at query time
• Still not fast enough

We are here Why?

Our assumptions about Cassandra were not valid. Namely, that it
would do most of the work for us.

- Dieter Rams

The Epiphany
Don’t let the physical data model dictate the logical data model.
Define an ideal logical model first. Then find a way to implement it using a physical data store.
This was not obvious coming from the relational or document store worlds.
It felt dirty. But it doesn’t have to be.

The Epiphany
Don’t store JSON as JSON!

CF: events

Row key      timestamp                        response.status   url
requests-0   ["2014-07-21…", "2014-07-21…"]   [200, 400]        ["keen.io", "keen.io"]
requests-1   ["2014-07-21…", "2014-07-21…"]   [200, 400]        ["keen.io", "keen.io"]

Row Key: UTF8Type. User’s collection name with a ‘bucket’ sequence number
Column Name: UTF8Type. The dotted.name of the property
Column Value: BytesType. A Kryo-serialized, compressed Object[] containing ~5000 property values

Notes
Order of properties in the Object[] arrays is *very* important!
timestamp[5], response.status[5] and url[5] must be properties from the same event
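A rough sketch of the pivot from JSON events to per-property arrays, in Python rather than the JVM code the slides imply. `pickle` plus `zlib` are stand-ins for Kryo serialization and compression, and `flatten`, `pack_bucket`, `read_column` and `rebuild_event` are hypothetical helper names. The point it tries to show is that slot i of every column belongs to the same event.

```python
import pickle
import zlib

# Placeholder so every column keeps exactly one slot per event.
# Simplification: a real JSON null cannot be told apart from "missing" here.
MISSING = None


def flatten(event, prefix=""):
    """Turn nested JSON into {dotted.name: value} pairs."""
    flat = {}
    for key, value in event.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def pack_bucket(events):
    """Pivot a list of events into {column name: compressed bytes}."""
    rows = [flatten(e) for e in events]
    names = sorted({name for row in rows for name in row})
    columns = {name: [row.get(name, MISSING) for row in rows] for name in names}
    return {name: zlib.compress(pickle.dumps(values)) for name, values in columns.items()}


def read_column(bucket, name):
    """Decompress and deserialize a single property column."""
    return pickle.loads(zlib.decompress(bucket[name]))


def rebuild_event(bucket, i):
    """Reassemble event i (in flattened form) by taking slot i from every column."""
    flat = {name: read_column(bucket, name)[i] for name in bucket}
    return {k: v for k, v in flat.items() if v is not MISSING}
```

Padding with a placeholder is one way to keep the arrays aligned when an event lacks a property; the talk does not say how Keen handled that case.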

The Good
Super-fast queries due to:
• columnar access
• columnar compression
• Kryo deserialization (fast!)
• parallelism via bucketing

Tradeoffs
• Application code is more complicated.
• Writes and reads must understand this structure.

Worth it!
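To make the columnar-access and bucketing points concrete, here is a hedged sketch of a count query that touches only one property column per bucket and fans out across buckets. `EVENTS_CF` is a hypothetical in-memory stand-in for the column family, `pickle` and `zlib` again stand in for Kryo and compression, and the thread pool only illustrates the fan-out shape, not real cluster parallelism.

```python
import pickle
import zlib
from concurrent.futures import ThreadPoolExecutor


def pack(values):
    return zlib.compress(pickle.dumps(values))


def unpack(blob):
    return pickle.loads(zlib.decompress(blob))


# Hypothetical contents of the 'events' CF: row key -> {property name: blob}
EVENTS_CF = {
    "requests-0": {
        "response.status": pack([200, 400, 200]),
        "url": pack(["keen.io", "keen.io", "keen.io/docs"]),
    },
    "requests-1": {
        "response.status": pack([200, 304]),
        "url": pack(["keen.io", "keen.io"]),
    },
}


def count_in_bucket(row_key, column, wanted):
    """Read a single column from one bucket and count matching values."""
    values = unpack(EVENTS_CF[row_key][column])
    return sum(1 for v in values if v == wanted)


def count_where(collection, column, wanted):
    """Fan out over every bucket of a collection and add up the partial counts."""
    row_keys = [k for k in EVENTS_CF if k.rsplit("-", 1)[0] == collection]
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda k: count_in_bucket(k, column, wanted), row_keys)
        return sum(partials)
```

Counting responses with status 200 never deserializes the `url` column, which is the saving that whole-event JSON blobs could not offer.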

Lessons Learned
• Set performance targets early on
• Design a logical data model first with the characteristics you need (columnar, compressible, partition-able), then figure out how to project it onto physical storage.
• Don’t be afraid to try crazy stuff that would feel unnatural in the relational or document worlds

3 Awesome Things That Happened In Production That Were Not
Awesome At The Time

Doomstones
A machine with corrupted RAM propagated future-dated tombstones around the ring. This made data ‘invisible’ for a number of row keys.

Serialization Incompatibility
A code change that altered the serialization format made it into the write path before the read path. Result: KryoException

Clock Drift
When ring clocks are not in sync, “Last Write Wins” becomes “Any Write Wins”.
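A toy model of timestamp-based reconciliation shows why drift hurts. It is a simplification, not Cassandra’s actual code: `Cell`, `stamp` and `read` are hypothetical names, ties are ignored, and a `None` value stands in for a tombstone. The same rule also explains the Doomstones slide.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]  # None stands for a tombstone (a delete marker)
    timestamp: int        # microseconds, taken from the writer's own clock


def stamp(real_time_us, clock_offset_us=0):
    """Timestamp a node would assign, given how far its clock is off."""
    return real_time_us + clock_offset_us


def read(cells):
    """Last write wins: return the value of the cell with the highest timestamp."""
    return max(cells, key=lambda c: c.timestamp).value


# Clocks in sync: the write that really happened last is the one that is read.
in_sync = [Cell("v1", stamp(10_000_000)), Cell("v2", stamp(12_000_000))]

# Node B's clock runs 5 s slow: its later write gets the lower timestamp and loses.
drifted = [Cell("v1", stamp(10_000_000)), Cell("v2", stamp(12_000_000, -5_000_000))]

# A tombstone dated far in the future hides every write made after it.
doomstone = [Cell(None, stamp(10**18)), Cell("v3", stamp(13_000_000))]
```

Calling `read` on the three lists returns the newest value only in the first case; in the other two the winner is decided by whichever clock or corrupted timestamp happens to be largest.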

Thanks!
Questions? Talk at my face or email me at [email protected]
