Firehose storage at paper.li

Firehose Storage at Paper.li Pierre-Yves Ritschard July 25, 2012

A quick introduction
  paper.li
    Baseline: “Curation platform enabling millions of users to become the publishers of their own daily newspapers.”
    Canonical use case: daily newspaper on an interest
    Feeds on a wide variety of “social media” sources
  @pyr
    Lead architect at Smallrivers, I like to build big stuff
    Long time involvement in distributed systems, scalability
    Recent FP and big data convert

About paper.li

Before Cassandra
  Standard RoR stack
  MySQL strongly normalized datastore
  Memcached

Storage workload
  Two write heavy pipelines
    Continuous source aggregation
    Edition publisher
  Random read workload
    Huge mass of somewhat popular newspapers

Storage painpoints
  You know the drill
  Crazy cache warm up
  Impossible schema changes
  Somewhat manageable read workload
  Unmanageable write workload

Requirements
  Improving status quo
    I/O pressure release
    Constant time writes
    Limiting operations overhead
  Going further
    More metadata
    Analytics
    Adapting behavior
    Storing way more data

Considered solutions
  Sharding MySQL
  HBase / Voldemort / ElephantDB
  Riak
  Apache Cassandra

Cassandra winning points: operations
  Standalone stack
  P2P architecture: no SPOF
  Multi datacenter support
  Extensive JMX support

Cassandra winning points: modeling
  Flexible schema
  Easy index and relations storage
  Distributed counters
  JVM compatibility: we’re a Clojure shop
  One stop shop answer for our storage needs (almost)

Our usage of Cassandra
  Entity storage
    Papers, Sources
  Relations and indices
    Articles found in sources, ordered by time of appearance
  Logs and events
    Events happening on a source
  Analytics
    View hits, contributor appearances

Data layout
  Data types
    Row keys, column names and column values use a serializer
    UTF-8 String
    UUID (TimeUUID)
    Long
    Composite
    BYO
  Keyspaces
    The equivalent of databases
  Column Families
    The equivalent of tables
    No fixed amount of columns in rows (wide rows)
    Column metadata can exist

What storage looks like
  One can think of Cassandra CFs as double depth hash tables

    {"twitter": {
      "Users": {
        "@steeve": {
          "name": "Steeve Morin"
        }
      },
      "Followers": {
        "@steeve": { "@pyr": null, "@camping": null }
      },
      "Timelines": {
        "@steeve": { "2012-07-25-00:00:00": "meetup !",
                     "2012-07-24-00:01:34": "foo" }
      }
    }}
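The double depth hash table picture can be sketched with nested Python dictionaries: column family, then row key, then column name. This is only an in-memory illustration of the layout shown on the slide, not a Cassandra client; the helper names `make_keyspace`, `insert` and `get_row` are hypothetical.

```python
from collections import defaultdict


def make_keyspace():
    # column family -> row key -> {column name: column value}
    return defaultdict(lambda: defaultdict(dict))


def insert(keyspace, cf, row_key, column, value=None):
    """Set one column in a row; the row is created on first write."""
    keyspace[cf][row_key][column] = value


def get_row(keyspace, cf, row_key):
    """Return a copy of the row's columns, or an empty dict for an unknown row."""
    return dict(keyspace[cf].get(row_key, {}))


# Same data as the "twitter" keyspace on the slide.
twitter = make_keyspace()
insert(twitter, "Users", "@steeve", "name", "Steeve Morin")
insert(twitter, "Followers", "@steeve", "@pyr")      # column without a value
insert(twitter, "Followers", "@steeve", "@camping")  # column without a value
insert(twitter, "Timelines", "@steeve", "2012-07-25-00:00:00", "meetup !")
insert(twitter, "Timelines", "@steeve", "2012-07-24-00:01:34", "foo")
```

Calling `get_row(twitter, "Followers", "@steeve")` returns the follower names as keys with no values attached, which is the "relation stored as column names" idea.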

Cassandra schema specifics
  Rows don’t need to look alike
  Columns are sorted by column name
  Column values can have arbitrary types
  You don’t need column values
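Because columns are sorted by name, a wide row keyed by timestamps can answer range queries cheaply. The sketch below mimics that behavior in memory with the standard `bisect` module; the `WideRow` class and its `slice` method are hypothetical names used for illustration, and the third timeline entry is made up.

```python
import bisect


class WideRow:
    """One row whose columns are kept sorted by column name."""

    def __init__(self):
        self._names = []   # sorted column names
        self._values = {}  # column name -> column value (may be None)

    def insert(self, name, value=None):
        # Only new names need to be placed in the sorted list.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name <= end, in column order."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, end)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


timeline = WideRow()
timeline.insert("2012-07-25-00:00:00", "meetup !")
timeline.insert("2012-07-24-00:01:34", "foo")
timeline.insert("2012-07-23-09:15:00", "bar")  # hypothetical extra entry
```

A call such as `timeline.slice("2012-07-24", "2012-07-25-99")` returns only the columns whose names fall inside that range, already in order.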

Denormalization, what and why?
  Copying the same data in different places
  Reads are more expensive than writes
  Hard disk storage is a commodity

Denormalization canonical example: before

    SELECT * FROM UserFollowers f, Tweets t
    WHERE f.user_name = "@pyr"
      AND t.user_name = f.followee_name;

Denormalization canonical example: after

    SELECT * FROM Timelines WHERE KEY = "@pyr";
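What makes the single-row read possible is doing the join at write time: each post is copied into the timeline row of every follower. A minimal in-memory sketch of that fan-out follows; the function names `follow`, `post` and `read_timeline` are hypothetical, and the storage is plain dictionaries rather than Cassandra.

```python
from collections import defaultdict

followers = defaultdict(set)   # user -> set of users following them
timelines = defaultdict(dict)  # reader -> {timestamp: (author, text)}


def follow(follower, followee):
    followers[followee].add(follower)


def post(author, timestamp, text):
    """Write path: copy the post into the timeline row of every follower."""
    for reader in followers[author]:
        timelines[reader][timestamp] = (author, text)


def read_timeline(reader):
    """Read path: a single row lookup, newest entries first."""
    return sorted(timelines[reader].items(), reverse=True)


follow("@pyr", "@steeve")
follow("@camping", "@steeve")
post("@steeve", "2012-07-24-00:01:34", "foo")
post("@steeve", "2012-07-25-00:00:00", "meetup !")
```

The cost moves to the write path (one write per follower), which matches the slide's premise that reads are more expensive than writes and that disk space is cheap.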

Consistency levels
  A way to express your requirements

  Read consistency    Property
  ONE                 Response from the closest replica
  QUORUM              (Replication Factor / 2) + 1 replicas must agree
  ALL                 All replicas must agree

  Write consistency   Property
  ANY                 The write reached one node
  ONE                 The write must have been successfully performed on at least one of the replicas
  QUORUM              (Replication Factor / 2) + 1 replicas must have successfully performed the write
  ALL                 All replicas must have performed the write
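The QUORUM formula is easy to check numerically. The sketch below computes how many replicas each level requires for a given replication factor, and whether a read level and a write level are guaranteed to share at least one replica (R + W > RF). The function names are hypothetical, and ANY is deliberately left out because it does not require a replica of the row to acknowledge.

```python
def quorum(replication_factor):
    """Replicas required at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replica acknowledgements needed for ONE, QUORUM or ALL."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def reads_see_latest_write(read_level, write_level, replication_factor):
    """True when every read set must intersect every write set (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, QUORUM means 2 replicas, so reading and writing at QUORUM always overlap on at least one replica, while reading and writing at ONE do not.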

Dealing with the CAP theorem
  Choose the strategy that matches the data you are handling
  Storing entities is sensitive
    Ensure a high consistency level, e.g. read and write at QUORUM
    Regular CF snapshots
  Storing events can sustain consistency mishaps
    Writing at ONE should be sufficient

Let’s talk numbers
  More than 15000 posts computed per second (peak)
    On average 200M per day
    Associated social counters updated for analytics
    Associated log event storage for scheduler input
  More than 3000 articles computed per second (peak)
  600k paper editions per day
    Each pulling from wide rows to filter, rank and output an edition
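To put the daily figures on the same scale as the peak rates, they can be converted to per-second averages. The arithmetic below assumes the 200M per day figure refers to posts, which the slide implies but does not state outright; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def per_second(per_day):
    """Convert a daily total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


avg_posts_per_s = per_second(200_000_000)  # roughly 2,300 posts per second
editions_per_s = per_second(600_000)       # roughly 7 editions per second
peak_to_avg = 15_000 / avg_posts_per_s     # peak is about 6.5x the average
```

Under that assumption, the write pipeline has to absorb bursts several times larger than its average load.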

Some gotchas
  Don’t forget to cache
  When possible, use fast disks (SSD)
  Give your instances space to breathe
  Split clusters
  Node operations aren’t free

Questions?
  @pyr, https://github.com/pyr
  Slides soon on http://spootnik.org
