Store JSON in Cassandra the Hard Way

Keen IO stores events in the same format that our customers send them in: JSON. Yet Keen uses Cassandra, a distributed database without any JSON primitives, and still gives customers the ability to query over arbitrary (even nested) JSON dimensions. How can this be?

Josh Dzielak

July 23, 2014

Transcript

  1. None
  2. JSON?

  3. wat JSON?

  4. Cassandra does not have a JSON data type

  5. Cassandra does not have a JSON data type. What do we do?
  6. Example JSON event (models an HTTP request):

    { "id": "c645-abd3", "timestamp": "2013-08-06T12:30:00-0700Z", "url": "https://keen.io", "method": "GET", "response": { "status": 200 } }
  7. Use static column families?

  8. Use static column families?

    CREATE TABLE requests ( KEY uuid PRIMARY KEY, timestamp text, url text, response_status int, … )
  9. Use static column families?

    • Possible if schema is known in advance

    CREATE TABLE requests ( KEY uuid PRIMARY KEY, timestamp text, url text, response_status int, … )
  10. Use static column families?

    • Possible if schema is known in advance
    • Still might not be fast to scan 1M+ / 1B+ rows

    CREATE TABLE requests ( KEY uuid PRIMARY KEY, timestamp text, url text, response_status int, … )
  11. Use static column families?

    • Possible if schema is known in advance
    • Still might not be fast to scan 1M+ / 1B+ rows
    • Not possible with Keen: schema is implicitly defined by customers and not enforced at write time

    CREATE TABLE requests ( KEY uuid PRIMARY KEY, timestamp text, url text, response_status int, … )
  12. Use static column families?

    • Possible if schema is known in advance
    • Still might not be fast to scan 1M+ / 1B+ rows
    • Not possible with Keen: schema is implicitly defined by customers and not enforced at write time
    • Keen has 500,000+ user-defined collections today, stored in 1 Cassandra column family

    CREATE TABLE requests ( KEY uuid PRIMARY KEY, timestamp text, url text, response_status int, … )
  13. 3 Perfectly Plausible Attempts to Store JSON in Cassandra

  14. 3 Perfectly Plausible Attempts to Store JSON in Cassandra And The 1 That Worked
  15. CF: events

    requests | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 400 } | . . .

    Row Key: UTF8Type. User’s collection name (w/ Project ID prefix, omitted for brevity).
    Column Name: Composite(UTF8Type, TimeUUID). A unique UUID for the event, plus a TimeUUID.
  16. Good

    • Simple write path, can filter by timeframe

    CF: events
    requests | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 400 } | . . .
  17. Good

    • Simple write path, can filter by timeframe

    Bad

    • Row growth is unbounded; hot spots for big collections
    • Can only filter on time by > or <, not both
    • Little gain from parallelizing b/c the row is the same
    • JSON deserialization in queries is expensive

    CF: events
    requests | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 400 } | . . .
  18. CF: events

    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  19. • Turned ‘requests’ into ‘requests-[0…n]’

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  20. • Turned ‘requests’ into ‘requests-[0…n]’

    • No more unbounded row growth

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  21. • Turned ‘requests’ into ‘requests-[0…n]’

    • No more unbounded row growth
    • Obvious, effective way to parallelize queries

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  22. • Turned ‘requests’ into ‘requests-[0…n]’

    • No more unbounded row growth
    • Obvious, effective way to parallelize queries
    • Required another CF, an ‘index’, to store pointers to these ‘buckets’

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .

    CF: index
    requests | 0 → { "start": "2014-07-21" … } | 1 → { "start": "2014-07-22" … } | . . .
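
The bucketing scheme on slides 18 to 22 can be sketched with two in-memory dictionaries standing in for the `events` and `index` column families. The rollover rule (a fixed number of events per bucket), the `BUCKET_SIZE` value, and the function names are assumptions made for illustration; the slides do not say how Keen decides when to start a new bucket.

```python
from collections import defaultdict

BUCKET_SIZE = 3  # hypothetical rollover threshold, kept tiny for illustration

events_cf = defaultdict(dict)  # row key "requests-0" -> {column name: JSON string}
index_cf = defaultdict(dict)   # row key "requests"   -> {bucket number: metadata}


def write_event(collection, column_name, json_body, day):
    """Append an event to the current bucket, rolling over when it is full."""
    buckets = index_cf[collection]
    current = max(buckets) if buckets else 0
    if not buckets:
        buckets[current] = {"start": day}
    if len(events_cf[f"{collection}-{current}"]) >= BUCKET_SIZE:
        # Bucket is full: register the next one in the index before writing.
        current += 1
        buckets[current] = {"start": day}
    events_cf[f"{collection}-{current}"][column_name] = json_body
    return current


def bucket_keys(collection):
    """Row keys a query has to read; each one can be scanned independently."""
    return [f"{collection}-{n}" for n in sorted(index_cf[collection])]
```

Every write now touches the index as well as the event row, which is the extra write-path complexity the next slides raise as a concern.
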
  23. Concerns

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  24. Concerns

    • Write path has become more complex; position of bucket index must be kept in another CF and rolled over

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  25. Concerns

    • Write path has become more complex; position of bucket index must be kept in another CF and rolled over
    • Keeping consistent state for more than 1 CF (solution: atomic batch)

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  26. Concerns

    • Write path has become more complex; position of bucket index must be kept in another CF and rolled over
    • Keeping consistent state for more than 1 CF (solution: atomic batch)
    • Still have to deserialize JSON at query time

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  27. Concerns

    • Write path has become more complex; position of bucket index must be kept in another CF and rolled over
    • Keeping consistent state for more than 1 CF (solution: atomic batch)
    • Still have to deserialize JSON at query time
    • Still not fast enough

    CF: events
    requests-0 | cef7-be80:TimeUUID() → { "status": 200 } | a87b-472c:TimeUUID() → { "status": 304 } | . . .
    requests-1 | 9f45-bf97:TimeUUID() → { "status": 417 } | 76ab-dca6:TimeUUID() → { "status": 200 } | . . .
    requests-n | . . .
  28. We are here

  29. We are here. Why?

  30. Our assumptions about Cassandra were not valid.

  31. Our assumptions about Cassandra were not valid. Namely, that it would do most of the work for us.
  32. None
  33. - Dieter Rams

  34. The Epiphany

  35. The Epiphany

    Don’t let the physical data model dictate the logical data model.
  36. The Epiphany

    Don’t let the physical data model dictate the logical data model. Define an ideal logical model first. Then find a way to implement it using a physical data store.
  37. The Epiphany

    Don’t let the physical data model dictate the logical data model. Define an ideal logical model first. Then find a way to implement it using a physical data store. This was not obvious coming from the relational or document store worlds.
  38. The Epiphany

    Don’t let the physical data model dictate the logical data model. Define an ideal logical model first. Then find a way to implement it using a physical data store. This was not obvious coming from the relational or document store worlds. It felt dirty. But it doesn’t have to be.
  39. The Epiphany: Don’t store JSON as JSON!

  40. CF: events

    requests-0 | timestamp → ["2014-07-21…", "2014-07-21…"] | response.status → [200, 400] | url → ["keen.io", "keen.io"]
    requests-1 | timestamp → ["2014-07-21…", "2014-07-21…"] | response.status → [200, 400] | url → ["keen.io", "keen.io"]

    Row Key: UTF8Type. User’s collection name with a ‘bucket’ sequence number.
    Column Name: UTF8Type. The dotted.name of the property.
    Column Value: BytesType. A Kryo-serialized, compressed Object[] containing ~5000 property values.
  41. CF: events

    requests-0 | timestamp → ["2014-07-21…", "2014-07-21…"] | response.status → [200, 400] | url → ["keen.io", "keen.io"]
    requests-1 | timestamp → ["2014-07-21…", "2014-07-21…"] | response.status → [200, 400] | url → ["keen.io", "keen.io"]

    Notes: Order of properties in the Object[] arrays is *very* important! timestamp[5], response.status[5] and url[5] must be properties from the same event.
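
A minimal sketch of the columnar layout from slides 40 and 41: nested JSON events are flattened to dotted property names, then pivoted into one positionally aligned list per property. The function names, the sample timestamps, and the choice to pad a missing property with `None` are assumptions; the slides do not say how Keen handles events that lack a property. Kryo serialization and compression of each list are left out.

```python
def flatten(event, prefix=""):
    """Turn nested JSON into dotted names, e.g. {"response.status": 200}."""
    flat = {}
    for key, value in event.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_columns(events):
    """Pivot a bucket of events into one aligned list per property.

    Index i of every list belongs to event i. A property missing from an
    event is stored as None (an assumption) so the lists stay aligned.
    """
    flat_events = [flatten(e) for e in events]
    names = sorted({name for flat in flat_events for name in flat})
    return {name: [flat.get(name) for flat in flat_events] for name in names}


# Invented sample events in the shape of the slide 6 example.
bucket = to_columns([
    {"timestamp": "2014-07-21T10:00:00Z", "url": "keen.io", "response": {"status": 200}},
    {"timestamp": "2014-07-21T10:00:05Z", "url": "keen.io", "response": {"status": 400}},
    {"timestamp": "2014-07-21T10:00:09Z", "url": "keen.io"},
])
```

Here `bucket["response.status"][1]` and `bucket["timestamp"][1]` describe the same event, which is the ordering guarantee the note on slide 41 insists on.
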
  42. The Good

    Super-fast queries due to:
    • columnar access
    • columnar compression
    • Kryo deserialization (fast!)
    • parallelism via bucketing
  43. The Good

    Super-fast queries due to:
    • columnar access
    • columnar compression
    • Kryo deserialization (fast!)
    • parallelism via bucketing

    Tradeoffs

    • Application code is more complicated.
    • Writes and reads must understand this structure.
  44. The Good

    Super-fast queries due to:
    • columnar access
    • columnar compression
    • Kryo deserialization (fast!)
    • parallelism via bucketing

    Tradeoffs

    • Application code is more complicated.
    • Writes and reads must understand this structure.

    Worth it!
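
To illustrate why columnar access and bucketing help, here is a sketch of a filter query over already-decoded buckets. The `BUCKETS` data, the function names, and the use of a thread pool are all hypothetical; this is not Keen's query engine, only the shape of the idea.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, already-decoded buckets: row key -> {property name: values}.
BUCKETS = {
    "requests-0": {"response.status": [200, 400, 200], "url": ["a", "b", "c"]},
    "requests-1": {"response.status": [417, 200], "url": ["d", "e"]},
}


def count_where(row_key, prop, expected):
    """Scan one column of one bucket; the other columns are never touched."""
    return sum(1 for value in BUCKETS[row_key][prop] if value == expected)


def urls_where(row_key, prop, expected):
    """Use positional alignment to fetch a second property for matching events."""
    columns = BUCKETS[row_key]
    return [url for value, url in zip(columns[prop], columns["url"]) if value == expected]


def parallel_count(prop, expected):
    """Fan the per-bucket scans out to worker threads and add up the results."""
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(lambda key: count_where(key, prop, expected), BUCKETS))
```

A count on `response.status` reads only that column, and each bucket is an independent unit of work, which is where the parallelism comes from.
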
  45. Whee

  46. Lessons Learned

    • Set performance targets early on
    • Design a logical data model first with the characteristics you need (columnar, compressible, partition-able), then figure out how to project it onto physical storage.
    • Don’t be afraid to try crazy stuff that would feel unnatural in the relational or document worlds
  47. 3 Awesome Things That Happened In Production That Were Not Awesome At The Time
  48. Doomstones

    A machine with corrupted RAM propagated future-dated tombstones around the ring. This made data ‘invisible’ for a number of row keys.
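
A simplified model of why a future-dated tombstone hides data: when several versions of a cell exist, the one with the highest timestamp wins, and a winning tombstone makes the cell read as missing. The `reconcile` function and the timestamp values are illustrative assumptions; tie-breaking, compaction, and tombstone expiry are left out.

```python
TOMBSTONE = object()  # marker for a deleted cell


def reconcile(cells):
    """Return the visible value among (timestamp, value) versions of one cell.

    The version with the highest timestamp wins; if the winner is a
    tombstone, the cell reads as missing (None).
    """
    timestamp, value = max(cells, key=lambda cell: cell[0])
    return None if value is TOMBSTONE else value


now = 1_406_073_600_000_000                      # microseconds, around July 23, 2014
future = now + 10 * 365 * 86_400 * 1_000_000     # a corrupted, far-future timestamp

versions = [(now, "event data"), (future, TOMBSTONE)]
visible_before = reconcile(versions)

versions.append((now + 1_000_000, "rewritten event data"))  # a later, honest write
visible_after = reconcile(versions)
```

Both `visible_before` and `visible_after` are `None`: until the clock catches up with the tombstone's timestamp, every correctly timestamped rewrite loses to it.
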
  49. Serialization Incompatibility

    A code change that altered the serialization format made it into the write path before the read path. Result: KryoException
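
The deploy-order problem can be reproduced with a toy format. This sketch uses JSON with a leading version byte as a stand-in for Kryo; the version byte, the two formats, and the reader factory are hypothetical and are not Keen's actual format or fix.

```python
import json


def write_v1(values):
    """Old format: version byte 1, then a bare JSON array."""
    return b"\x01" + json.dumps(values).encode("utf-8")


def write_v2(values):
    """Hypothetical new format: version byte 2, values wrapped in an object."""
    return b"\x02" + json.dumps({"values": values}).encode("utf-8")


def make_reader(supported_versions):
    """Build a reader that understands only the given format versions."""
    def read(blob):
        version = blob[0]
        if version not in supported_versions:
            raise ValueError(f"unknown serialization version {version}")
        payload = json.loads(blob[1:].decode("utf-8"))
        return payload if version == 1 else payload["values"]
    return read


old_reader = make_reader({1})      # read path that has not been upgraded yet
new_reader = make_reader({1, 2})   # read path that understands both formats
```

Calling `old_reader(write_v2([200, 400]))` raises, which mirrors the exception on the slide. Rolling out `new_reader` before any writer emits the new format is one way to avoid that window.
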
  50. Clock Drift

    When ring clocks are not in sync, “Last Write Wins” becomes “Any Write Wins”.
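
A small illustration of how clock drift breaks last-write-wins: conflict resolution compares the timestamps attached to each write, not the order in which the writes really happened. The five-second skew and all names are made up for the example.

```python
def last_write_wins(writes):
    """Pick the value whose attached timestamp is highest."""
    return max(writes, key=lambda write: write["timestamp"])["value"]


# Real order of events: "first" is written, then "second" two seconds later.
real_time = 1000.0
skew = 5.0  # hypothetical: the first writer's clock runs five seconds fast

write_a = {"value": "first", "timestamp": real_time + skew}   # stamped by the fast clock
write_b = {"value": "second", "timestamp": real_time + 2.0}   # stamped by an accurate clock

winner = last_write_wins([write_a, write_b])
```

`winner` is `"first"` even though `"second"` was written later, so the outcome depends on which node's clock stamped the write rather than on write order.
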
  51. Thanks! Questions? Talk at my face or email me at josh@keen.io