
Store JSON in Cassandra the Hard Way


Keen IO stores events in the same format that our customers send them in—JSON. Yet, Keen uses Cassandra, a distributed database without any JSON primitives, and Keen gives customers the ability to query over arbitrary (even nested) JSON dimensions. How can this be???

Josh Dzielak

July 23, 2014

Transcript

  1. View Slide

  2. JSON?


  3. wat
    JSON?


  4. Cassandra does not have a
    JSON data type


  5. Cassandra does not have a
    JSON data type
    What do we do?


  6. {
      "id": "c645-abd3",
      "timestamp": "2013-08-06T12:30:00-0700Z",
      "url": "https://keen.io",
      "method": "GET",
      "response": {
        "status": 200
      }
    }
    Example JSON Event
    Models an HTTP request


  7. Use static column families?


  8. Use static column families?
    CREATE TABLE requests (
      key uuid PRIMARY KEY,
      timestamp text,
      url text,
      response_status int
    );


  9. Use static column families?
    • Possible if schema is known in advance
    CREATE TABLE requests (
      key uuid PRIMARY KEY,
      timestamp text,
      url text,
      response_status int
    );


  10. Use static column families?
    • Possible if schema is known in advance
    • Still might not be fast to scan 1M+ / 1B+ rows
    CREATE TABLE requests (
      key uuid PRIMARY KEY,
      timestamp text,
      url text,
      response_status int
    );


  11. Use static column families?
    • Possible if schema is known in advance
    • Still might not be fast to scan 1M+ / 1B+ rows
    • Not possible with Keen: schema is implicitly defined by
    customers, and not enforced at write time
    CREATE TABLE requests (
      key uuid PRIMARY KEY,
      timestamp text,
      url text,
      response_status int
    );


  12. Use static column families?
    • Possible if schema is known in advance
    • Still might not be fast to scan 1M+ / 1B+ rows
    • Not possible with Keen: schema is implicitly defined by
    customers, and not enforced at write time
    • Keen has 500,000+ user-defined collections today,
    stored in 1 Cassandra column family
    CREATE TABLE requests (
      key uuid PRIMARY KEY,
      timestamp text,
      url text,
      response_status int
    );



  13. View Slide

  14. 3 Perfectly Plausible Attempts
    to Store JSON in Cassandra


  15. 3 Perfectly Plausible Attempts
    to Store JSON in Cassandra
    And The 1 That Worked


  16. requests
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 400 } . . .
    Row Key: UTF8Type
      User’s collection name (w/ Project ID prefix, omitted for brevity)
    Column Name: Composite(UTF8Type, TimeUUID)
      A unique UUID for the event, plus a TimeUUID
    CF: events


  17. Good: Simple write path, can filter by timeframe
    CF: events
    requests
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 400 } . . .


  18. Good: Simple write path, can filter by timeframe
    Bad:
    • Row growth is unbounded; hot spots for big collections
    • Can only filter on time by > or <, not both
    • Little gain from parallelizing because every query hits the same row
    • JSON deserialization in queries is expensive
    CF: events
    requests
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 400 } . . .
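
A rough illustration of this first attempt, not Keen's code: the sketch below models one wide row per collection as an in-memory structure. `WideRow` and `count_status` are hypothetical names, and the column name is simplified to a `(timestamp, event_id)` tuple instead of the composite with a TimeUUID shown on the slide. It shows why queries are costly: every event in the timeframe is fully parsed even when only one property is needed.

```python
import bisect
import json


class WideRow:
    """Hypothetical stand-in for a single wide row in the `events` column family."""

    def __init__(self):
        self.names = []   # column names, kept sorted like Cassandra columns
        self.values = {}  # column name -> JSON text of the event

    def insert(self, timestamp, event_id, event):
        name = (timestamp, event_id)
        bisect.insort(self.names, name)
        self.values[name] = json.dumps(event)

    def slice_from(self, start):
        """Yield raw JSON for every column whose timestamp is >= start."""
        lo = bisect.bisect_left(self.names, (start, ""))
        for name in self.names[lo:]:
            yield self.values[name]


def count_status(row, start, status):
    # Each event is deserialized in full just to look at response.status.
    return sum(
        1
        for raw in row.slice_from(start)
        if json.loads(raw).get("response", {}).get("status") == status
    )
```

All reads and writes for a collection land on the same row, which is the hot-spot problem listed above.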


  19. requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .
    CF: events


  20. • Turned ‘requests’ into ‘requests-[0…n]’
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .
    CF: events


  21. • Turned ‘requests’ into ‘requests-[0…n]’
    • No more unbounded row growth
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .
    CF: events


  22. • Turned ‘requests’ into ‘requests-[0…n]’
    • No more unbounded row growth
    •Obvious, effective way to parallelize queries
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .
    CF: events


  23. • Turned ‘requests’ into ‘requests-[0…n]’
    • No more unbounded row growth
    •Obvious, effective way to parallelize queries
    •Required another CF, an ‘index’ to store pointers to these
    ‘buckets’:
    CF: events
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .
    CF: index
    requests
    0 1 . . .
    { “start” : “2014-07-21” … } { “start” : “2014-07-22” … } . . .


  24. Concerns
    CF: events
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .


  25. Concerns
    • Write path has become more complex; position of
    bucket index must be kept in another CF and rolled over
    CF: events
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .


  26. Concerns
    • Write path has become more complex; position of
    bucket index must be kept in another CF and rolled over
    • Keeping consistent state for more than 1 CF (solution: atomic batch)
    CF: events
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .


  27. Concerns
    • Write path has become more complex; position of
    bucket index must be kept in another CF and rolled over
    • Keeping consistent state for more than 1 CF (solution: atomic batch)
    • Still have to deserialize JSON at query time
    CF: events
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .


  28. Concerns
    • Write path has become more complex; position of
    bucket index must be kept in another CF and rolled over
    • Keeping consistent state for more than 1 CF (solution: atomic batch)
    • Still have to deserialize JSON at query time
    • Still not fast enough
    CF: events
    requests-0
    cef7-be80:TimeUUID() a87b-472c:TimeUUID() . . .
    { “status” : 200 } { “status” : 304 } . . .
    requests-1
    9f45-bf97:TimeUUID() 76ab-dca6:TimeUUID() . . .
    { “status” : 417 } { “status” : 200 } . . .
    requests-n . . . . . . . . .
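
To make the extra write-path work concrete, here is a minimal sketch of bucketed writes with a separate index, using plain dictionaries in place of the two column families. `BUCKET_SIZE`, `write_event`, and `bucket_keys` are hypothetical names chosen for illustration, and the rollover rule (a fixed event count per bucket) is an assumption; the slides do not say how buckets were rolled over.

```python
import json

BUCKET_SIZE = 3  # hypothetical and deliberately tiny so rollover is easy to see

events = {}  # stand-in for CF events: row key -> list of JSON strings
index = {}   # stand-in for CF index: collection -> list of bucket metadata


def write_event(collection, event):
    """Append an event to the current bucket, rolling over when it is full."""
    buckets = index.setdefault(collection, [])
    if not buckets or buckets[-1]["count"] >= BUCKET_SIZE:
        buckets.append({"start": event["timestamp"], "count": 0})
    row_key = f"{collection}-{len(buckets) - 1}"
    # In Cassandra the next two updates touch two column families,
    # which is the consistency concern an atomic batch addresses.
    events.setdefault(row_key, []).append(json.dumps(event))
    buckets[-1]["count"] += 1
    return row_key


def bucket_keys(collection):
    """Row keys a query can fan out over, one per bucket."""
    return [f"{collection}-{i}" for i in range(len(index.get(collection, [])))]
```

Queries can now be parallelized across `bucket_keys(...)`, but each bucket still holds JSON text that has to be parsed at read time.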


  29. We are here


  30. We are here Why?


  31. Our assumptions about Cassandra
    were not valid.


  32. Our assumptions about Cassandra
    were not valid.
    Namely, that it would do most of the work for us.


  33. View Slide

  34. - Dieter Rams


  35. The Epiphany


  36. The Epiphany
    Don’t let the physical data model dictate
    the logical data model.


  37. The Epiphany
    Don’t let the physical data model dictate
    the logical data model.
    Define an ideal logical model first. Then find a way
    to implement it using a physical data store.


  38. The Epiphany
    Don’t let the physical data model dictate
    the logical data model.
    Define an ideal logical model first. Then find a way
    to implement it using a physical data store.
    This was not obvious coming from the relational
    or document store worlds.


  39. The Epiphany
    Don’t let the physical data model dictate
    the logical data model.
    Define an ideal logical model first. Then find a way
    to implement it using a physical data store.
    This was not obvious coming from the relational
    or document store worlds.
    It felt dirty. But it doesn’t have to be.


  40. Don’t store JSON as JSON!
    The Epiphany


  41. CF: events
    requests-0
      timestamp: [“2014-07-21…”, “2014-07-21…”]
      response.status: [200, 400]
      url: [“keen.io”, “keen.io”]
    requests-1
      timestamp: [“2014-07-21…”, “2014-07-21…”]
      response.status: [200, 400]
      url: [“keen.io”, “keen.io”]
    Row Key: UTF8Type
      User’s collection name with a ‘bucket’ sequence number
    Column Name: UTF8Type
      The dotted.name of the property
    Column Value: BytesType
      A Kryo-serialized, compressed Object[]
      containing ~5000 property values
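
The layout above can be sketched in a few lines. This is an illustration under stated assumptions, not the talk's implementation: `pickle` and `zlib` from the Python standard library stand in for Kryo and for whatever compression was actually used, and `flatten`, `pack_bucket`, and `read_column` are hypothetical helper names.

```python
import pickle
import zlib


def flatten(obj, prefix=""):
    """Turn nested dicts into a flat {dotted.name: value} mapping."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def pack_bucket(bucket_events):
    """Build one row: column name = property name, column value = packed array."""
    flat_events = [flatten(e) for e in bucket_events]
    names = sorted({name for e in flat_events for name in e})
    # One array per property; a missing property becomes None so positions line up.
    return {
        name: zlib.compress(pickle.dumps([e.get(name) for e in flat_events]))
        for name in names
    }


def read_column(row, name):
    """Decompress and deserialize a single property array."""
    return pickle.loads(zlib.decompress(row[name]))
```

A query that only needs `response.status` reads and decodes that one column and never touches `url` or `timestamp`.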


  42. CF: events
    requests-0
      timestamp: [“2014-07-21…”, “2014-07-21…”]
      response.status: [200, 400]
      url: [“keen.io”, “keen.io”]
    requests-1
      timestamp: [“2014-07-21…”, “2014-07-21…”]
      response.status: [200, 400]
      url: [“keen.io”, “keen.io”]
    Notes
    Order of properties in the Object[] arrays is *very* important!
    timestamp[5], response.status[5] and url[5] must be
    properties from the same event
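
The ordering rule is easiest to see when it is broken. The sketch below is a hypothetical illustration (`build_columns` and `event_at` are invented names, and using `None` as the padding value is an assumption): when an event lacks a property, the column must still get a placeholder, otherwise every later index in that column points at the wrong event.

```python
def build_columns(flat_events, pad_missing=True):
    """Build one value list per property from already-flattened events."""
    names = sorted({name for e in flat_events for name in e})
    columns = {name: [] for name in names}
    for event in flat_events:
        for name in names:
            if name in event:
                columns[name].append(event[name])
            elif pad_missing:
                columns[name].append(None)  # keeps index i aligned across columns
    return columns


def event_at(columns, i):
    """Reassemble event i by reading position i from every column."""
    return {name: vals[i] for name, vals in columns.items() if vals[i] is not None}


flat = [
    {"timestamp": "t0", "url": "a"},
    {"timestamp": "t1"},              # no url on this event
    {"timestamp": "t2", "url": "c"},
]
aligned = build_columns(flat)
misaligned = build_columns(flat, pad_missing=False)
```

With padding, `event_at(aligned, 1)` gives back only the timestamp of the second event. Without it, `event_at(misaligned, 1)` attaches the third event's `url` to the second event.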


  43. The Good
    Super-fast queries due to:
    • columnar access
    • columnar compression
    • Kryo deserialization (fast!)
    • parallelism via bucketing


  44. The Good
    Super-fast queries due to:
    • columnar access
    • columnar compression
    • Kryo deserialization (fast!)
    • parallelism via bucketing
    Tradeoffs
    • Application code is more complicated.
    • Writes and reads must understand this structure.


  45. The Good
    Super-fast queries due to:
    • columnar access
    • columnar compression
    • Kryo deserialization (fast!)
    • parallelism via bucketing
    Tradeoffs
    • Application code is more complicated.
    • Writes and reads must understand this structure.
    Worth it!
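
As a sketch of how a read might combine columnar access with bucket parallelism: the query below fans out over bucket rows and decodes only the one column it needs. Everything here is an assumption for illustration: the in-memory `store`, the helper names, `pickle` and `zlib` in place of Kryo and the real compression, and a thread pool in place of whatever concurrency the production system used.

```python
import pickle
import zlib
from concurrent.futures import ThreadPoolExecutor


def pack(values):
    return zlib.compress(pickle.dumps(values))


def unpack(blob):
    return pickle.loads(zlib.decompress(blob))


# Hypothetical stand-in for CF events: row key -> {column name: packed array}
store = {
    "requests-0": {"response.status": pack([200, 400]), "url": pack(["keen.io", "keen.io"])},
    "requests-1": {"response.status": pack([200, 200]), "url": pack(["keen.io", "keen.io"])},
}


def count_in_bucket(row_key, column, wanted):
    # Only the requested column is fetched and deserialized.
    values = unpack(store[row_key][column])
    return sum(1 for v in values if v == wanted)


def count_equal(row_keys, column, wanted):
    """Run the per-bucket count concurrently and add up the partial results."""
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(lambda key: count_in_bucket(key, column, wanted), row_keys))
```

The application code carries the knowledge of the layout, which is the tradeoff named on the slide.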


  46. Whee


  47. Lessons Learned
    • Set performance targets early on
    • Design a logical data model first with the
    characteristics you need (columnar,
    compressible, partition-able), then figure out
    how to project it onto physical storage.
    • Don’t be afraid to try crazy stuff that would feel
    unnatural in the relational or document worlds


  48. 3 Awesome Things That
    Happened In Production
    That Were Not
    Awesome At The Time


  49. Doomstones
    A machine with corrupted RAM propagated
    future-dated tombstones around the ring. This
    made data ‘invisible’ for a number of row keys.
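
A simplified model of why a future-dated tombstone hides data, assuming the usual rule that a tombstone shadows any cell whose write timestamp is not newer than the tombstone's. `read_cell` and the dictionary shapes are hypothetical, and real Cassandra reconciliation has more cases than this.

```python
def read_cell(cell, tombstone):
    """Return the cell value, or None when a tombstone shadows it."""
    if tombstone is not None and tombstone["ts"] >= cell["ts"]:
        return None
    return cell["value"]


now = 1_400_000_000                                   # illustrative epoch seconds
ten_years = 10 * 365 * 24 * 3600
doomstone = {"ts": now + ten_years}                   # tombstone stamped far in the future
fresh_write = {"ts": now + 60, "value": {"status": 200}}
```

Every write made before the tombstone's timestamp stays hidden, so `read_cell(fresh_write, doomstone)` returns `None` even though the write happened after the delete in wall-clock terms.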


  50. Serialization Incompatibility
    A code change that altered the serialization format
    made it into the write path before the read path.
    Result: KryoException


  51. Clock Drift
    When ring clocks are not in sync, “Last Write
    Wins” becomes “Any Write Wins”.
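
A toy model of the effect, assuming conflicts are resolved purely by comparing write timestamps. `stamp` and `last_write_wins` are hypothetical helpers, and tie-breaking is simplified compared to what Cassandra does.

```python
def stamp(value, true_time, clock_offset):
    """A write carries the timestamp of the clock that issued it, drift included."""
    return {"value": value, "ts": true_time + clock_offset}


def last_write_wins(a, b):
    """Keep whichever write has the higher timestamp."""
    return a if a["ts"] >= b["ts"] else b


# The first write comes from a node whose clock runs 5 seconds fast.
first = stamp("old", true_time=100.0, clock_offset=5.0)
# The second write really happens 2 seconds later, on a node with a correct clock.
second = stamp("new", true_time=102.0, clock_offset=0.0)
winner = last_write_wins(first, second)
```

Here `winner["value"]` is `"old"`: the earlier write survives because its drifted timestamp is higher.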


  52. Thanks!
    Questions?
    Talk at my face
    or email me at
    [email protected]
