
Firehose storage at paper.li

A story of how paper.li ended up choosing Cassandra for most of its storage needs, what type of data is stored in Cassandra, and some real-life examples.

Pierre-Yves Ritschard

July 25, 2012



  1. A Quick introduction

    paper.li
    - Baseline: “Curation platform enabling millions of users to become the publishers of their own daily newspapers.”
    - Canonical use case: a daily newspaper on an interest
    - Feeds on a wide variety of “social media” sources

    @pyr
    - Lead architect at Smallrivers, I like to build big stuff
    - Long-time involvement in distributed systems and scalability
    - Recent FP and big data convert
  2. Storage workload

    Two write-heavy pipelines:
    - Continuous source aggregation
    - Edition publisher

    Random read workload:
    - Huge mass of somewhat popular newspapers
  3. Storage painpoints

    You know the drill:
    - Crazy cache warm-up
    - Impossible schema changes
    - Somewhat manageable read workload
    - Unmanageable write workload
  4. Requirements

    Improving the status quo:
    - I/O pressure release
    - Constant-time writes
    - Limiting operations overhead

    Going further:
    - More metadata
    - Analytics
    - Adapting behavior
    - Storing way more data
  5. Cassandra winning points: modeling

    - Flexible schema
    - Easy index and relation storage
    - Distributed counters
    - JVM compatibility (we’re a Clojure shop)
    - One-stop-shop answer for our storage needs (almost)
  6. Our usage of Cassandra

    - Entity storage: papers, sources
    - Relations and indices: articles found in sources, ordered by time of appearance
    - Logs and events: events happening on a source
    - Analytics: view hits, contributor appearances
  7. Data layout

    Data types:
    - Row keys, column names and column values use a serializer
    - UTF-8 String, UUID (TimeUUID), Long, Composite, BYO

    Keyspaces:
    - The equivalent of databases

    Column Families:
    - The equivalent of tables
    - No fixed number of columns in rows (wide rows)
    - Column metadata can exist
  8. What storage looks like

    One can think of Cassandra CFs as double-depth hash tables:

    {"twitter": {
        "Users": {
            "@steeve": {"name": "Steeve Morin"}
        },
        "Followers": {
            "@steeve": {"@pyr": null, "@camping": null}
        },
        "Timelines": {
            "@steeve": {"2012-07-25-00:00:00": "meetup !",
                        "2012-07-24-00:01:34": "foo"}
        }
    }}
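The double-depth hash table analogy can be sketched in plain Python (the dicts below are an illustration standing in for column families, rows and columns, not driver code; names come from the slide's Twitter example):

```python
# Hypothetical in-memory model of the slide's example: a keyspace as a dict
# of column families; each CF maps row keys to a dict of columns.
keyspace = {
    "Users": {
        "@steeve": {"name": "Steeve Morin"},
    },
    "Followers": {
        # Column names carry the data; values can stay empty (None).
        "@steeve": {"@pyr": None, "@camping": None},
    },
    "Timelines": {
        "@steeve": {
            "2012-07-25-00:00:00": "meetup !",
            "2012-07-24-00:01:34": "foo",
        },
    },
}

# Double-depth lookup: row key first, then column name.
assert keyspace["Users"]["@steeve"]["name"] == "Steeve Morin"
assert "@pyr" in keyspace["Followers"]["@steeve"]
```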
  9. Cassandra schema specifics

    - Rows don’t need to look alike
    - Columns are sorted by column name
    - Column values can have arbitrary types
    - You don’t need column values
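Because columns are kept sorted by name, a wide row doubles as an ordered index: a timeline row keyed by timestamp strings can answer range queries. A rough Python sketch of that idea (illustration only, with hypothetical names):

```python
# Illustration: sorted column names give you time-slice reads for free
# when column names are sortable timestamps, as in the Timelines example.
timeline = {
    "2012-07-24-00:01:34": "foo",
    "2012-07-25-00:00:00": "meetup !",
    "2012-07-23-12:00:00": "bar",
}

def slice_columns(row, start, end):
    """Return (name, value) pairs whose column name falls in [start, end]."""
    return [(name, row[name]) for name in sorted(row) if start <= name <= end]

recent = slice_columns(timeline, "2012-07-24", "2012-07-26")
# Two columns fall inside the window, oldest first.
assert [value for _, value in recent] == ["foo", "meetup !"]
```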
  10. Denormalization: what and why?

    - Copying the same data in different places
    - Reads are more expensive than writes
    - Hard disk storage is a commodity
  11. Denormalization canonical example: before

    SELECT * FROM UserFollowers f, Tweets t
     WHERE f.user_name = "@pyr"
       AND t.user_name = f.followee_name;
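The denormalized version of this read trades the join for extra writes: each tweet is copied into a per-follower timeline row at write time, so reading a timeline becomes a single-row lookup. A hedged Python sketch of that fan-out (hypothetical names, not the actual paper.li schema):

```python
# Hypothetical fan-out-on-write: copy each tweet into one wide timeline row
# per follower, so reading a timeline needs no join at all.
followers = {"@steeve": {"@pyr", "@camping"}}
timelines = {}  # row key: follower, columns: timestamp -> tweet body

def publish(author, ts, body):
    """Write the tweet once per follower (denormalized copies)."""
    for follower in followers.get(author, ()):
        timelines.setdefault(follower, {})[ts] = body

publish("@steeve", "2012-07-25-00:00:00", "meetup !")

# Each follower's timeline now holds its own copy of the tweet.
assert timelines["@pyr"]["2012-07-25-00:00:00"] == "meetup !"
assert timelines["@camping"]["2012-07-25-00:00:00"] == "meetup !"
```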
  12. Consistency levels

    A way to express your requirements.

    Read consistency:
    - ONE: response from the closest replica
    - QUORUM: (replication factor / 2) + 1 replicas must agree
    - ALL: all replicas must agree

    Write consistency:
    - ANY: the write reached at least one node
    - ONE: the write must have been successfully performed on at least one replica
    - QUORUM: (replication factor / 2) + 1 replicas must have successfully performed the write
    - ALL: all replicas must have performed the write
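The QUORUM rule is plain integer arithmetic, and it is what makes QUORUM reads combined with QUORUM writes strongly consistent: a read quorum and a write quorum always overlap in at least one replica. A quick check:

```python
def quorum(replication_factor):
    # (Replication Factor / 2) + 1, using integer division as on the slide.
    return replication_factor // 2 + 1

assert quorum(3) == 2
assert quorum(5) == 3

# Overlap property: read quorum + write quorum exceeds the replica count,
# so at least one replica involved in the read has seen the latest write.
rf = 3
assert quorum(rf) + quorum(rf) > rf
```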
  13. Dealing with the CAP theorem

    Choose the strategy that matches the data you are handling.

    Storing entities is sensitive:
    - Ensure a high consistency level, e.g. read and write at QUORUM
    - Regular CF snapshots

    Storing events can sustain consistency mishaps:
    - Writing at ONE should be sufficient
  14. Let’s talk numbers

    - More than 15,000 posts computed per second (peak), 200M per day on average
    - Associated social counters updated for analytics
    - Associated log event storage for scheduler input
    - More than 3,000 articles computed per second (peak)
    - 600k paper editions per day, each pulling from wide rows to filter, rank and output an edition
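A back-of-the-envelope check on those figures: 200M posts per day averages out to roughly 2,300 posts per second, so the quoted 15,000/s peak is about 6-7x the mean rate.

```python
# Sanity check on the slide's throughput figures.
posts_per_day = 200_000_000
seconds_per_day = 86_400

average_rate = posts_per_day / seconds_per_day
assert 2300 < average_rate < 2320          # roughly 2,315 posts/s on average

peak_rate = 15_000
assert 6 < peak_rate / average_rate < 7    # peak is about 6.5x the average
```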
  15. Some gotchas

    - Don’t forget to cache
    - When possible, use fast disks (SSD)
    - Give your instances space to breathe
    - Split clusters
    - Node operations aren’t free