Upgrade to Pro — share decks privately, control downloads, hide ads and more …

The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012

The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012

Session presented at Big Data Spain 2012 Conference
16th Nov 2012
ETSI Telecomunicacion UPM Madrid
www.bigdataspain.org
More info: http://www.bigdataspain.org/es-2012/conference/top-five-questions-about-nosql/jonathan-ellis

Cb6e6da05b5b943d2691ceefa3381cad?s=128

Big Data Spain

November 16, 2012
Tweet

Transcript

  1. Five questions for your NoSQL solution! Jonathan Ellis CTO, DataStax

    Project Chair, Apache Cassandra
  2. ©2012 DataStax how do I model my application?

  3. ©2012 DataStax Popular options • Key/value • Tabular • Document

    • Graph?
  4. ©2012 DataStax Schema is your friend { "id": "e451dd42-ece3-11e1-a0a3-34159e154f4c", "name":

    "jbellis", "state": "TX", "birthdate": "1/1/1976", "email_addresses": ["jbellis@gmail", "jbellis@datastax.com"], }
  5. ©2012 DataStax SQL can be your friend too CREATE TABLE

    users ( id uuid PRIMARY KEY, name text, state text, birth_date date ); CREATE INDEX ON users(state); SELECT * FROM users WHERE state=‘Texas’ AND birth_date > ‘1950-01-01’;
  6. ©2012 DataStax CREATE TABLE users ( id uuid PRIMARY KEY,

    name text, state text, birth_date date ); CREATE TABLE users_addresses ( user_id uuid REFERENCES users, email text ); SELECT * FROM users NATURAL JOIN users_addresses; Collections
  7. ©2012 DataStax CREATE TABLE users ( id uuid PRIMARY KEY,

    name text, state text, birth_date date ); CREATE TABLE users_addresses ( user_id uuid REFERENCES users, email text ); SELECT * FROM users NATURAL JOIN users_addresses; Collections X
  8. ©2012 DataStax CREATE TABLE users ( id uuid PRIMARY KEY,

    name text, state text, birth_date date, email_addresses set<text> ); UPDATE users SET email_addresses = email_addresses + {‘jbellis@gmail.com’, ‘jbellis@datastax.com’}; Collections
  9. ©2012 DataStax Joins don’t scale • No joins • No

    subqueries • No aggregation functions* or GROUP BY • ORDER BY?
  10. ©2012 DataStax SELECT * FROM tweets WHERE user_id IN (SELECT

    follower FROM followers WHERE user_id = ’driftx’) followers ? tweets
  11. ©2012 DataStax CREATE TABLE timeline ( user_id uuid, tweet_id timeuuid,

    tweet_author uuid, tweet_body text, PRIMARY KEY (user_id, tweet_id) ); Clustering in Cassandra user_id tweet_id _author _body jbellis 3290f9da.. rbranson lorem jbellis 3895411a.. tjake ipsum ... ... ... driftx 3290f9da.. rbranson lorem driftx 71b46a84.. yzhang dolor ... ... ... yukim 3290f9da.. rbranson lorem yukim e451dd42.. tjake amet ... ... ...
  12. ©2012 DataStax CREATE TABLE timeline ( user_id uuid, tweet_id timeuuid,

    tweet_author uuid, tweet_body text, PRIMARY KEY (user_id, tweet_id) ); Clustering in Cassandra user_id tweet_id _author _body jbellis 3290f9da.. rbranson lorem jbellis 3895411a.. tjake ipsum ... ... ... driftx 3290f9da.. rbranson lorem driftx 71b46a84.. yzhang dolor ... ... ... yukim 3290f9da.. rbranson lorem yukim e451dd42.. tjake amet ... ... ... SELECT * FROM timeline WHERE user_id = ’driftx’;
  13. ©2012 DataStax how does it perform?

  14. ©2012 DataStax VLDB benchmark

  15. ©2012 DataStax Locking

  16. ©2012 DataStax Efficiency

  17. ©2012 DataStax UPDATE users SET email_addresses = email_addresses + {...}

    WHERE user_id = ‘jbellis’;
  18. ©2012 DataStax Durability

  19. ©2012 DataStax Log-structured storage engine Memory Hard drive Memtable write(

    , ) k1 c1:v1 Commit log
  20. ©2012 DataStax Memory Hard drive Memtable write( , ) k1

    c1:v1 Commit log k1 c1:v1 k1 c1:v1
  21. ©2012 DataStax Memory Hard drive write( , ) k1 c2:v2

    k1 c1:v1 k1 c1:v1 k1 c2:v2 c2:v2
  22. ©2012 DataStax Memory Hard drive k1 c1:v1 k1 c1:v1 k1

    c2:v2 c2:v2 write( , ) k2 c1:v1 c2:v2 k2 c1:v1 c2:v2 k2 c1:v1 c2:v2
  23. ©2012 DataStax Memory Hard drive k1 c1:v1 k1 c1:v4 k1

    c2:v2 c2:v2 write( , ) k1 c1:v4 c3:v3 k2 c1:v1 c2:v2 k2 c1:v1 c2:v2 k1 c1:v4 c3:v3 c3:v3
  24. ©2012 DataStax Memory Hard drive SSTable flush k1 c1:v4 c2:v2

    k2 c1:v1 c2:v2 c3:v3 index / BF cleanup
  25. ©2012 DataStax No random writes

  26. ©2012 DataStax The gory details

  27. ©2012 DataStax Larger than memory datasets

  28. ©2012 DataStax how does it handle failure?

  29. ©2012 DataStax Classic partitioning with SPOF partition 1 partition 2

    partition 3 partition 4 router client
  30. ©2012 DataStax Availability • “High availability implies that a single

    fault will not bring down your system. Not ‘we’ll recover quickly.’” -- Ben Coverston: DataStax • “The biggest problem with failover is that you're almost never using it until it really hurts. It's like backups that you never test.” -- Rick Branson: Instagram
  31. ©2012 DataStax Fully distributed, no SPOF client p1 p1 p1

    p3 p6
  32. ©2012 DataStax Multiple datacenters

  33. ©2012 DataStax

  34. ©2012 DataStax Self-healing Client request Coordinator Replica internal request internal

    response response 1 2 3 4
  35. ©2012 DataStax Self-healing Client request Coordinator Replica internal request internal

    response response 1 2 3 4
  36. ©2012 DataStax Self-healing Client request Coordinator Replica internal request 1

    2 replica fails timeout response 4
  37. ©2012 DataStax Self-healing Client request Coordinator Replica internal request 1

    2 X replica fails timeout response 4
  38. ©2012 DataStax Self-healing Client request Coordinator Replica internal request 1

    2 4 replica fails timeout response hint 3
  39. ©2012 DataStax Self-healing Client request Coordinator Replica internal request 1

    2 4 X replica fails timeout response hint 3
  40. ©2012 DataStax Other healing modes • AntiEntropyService • Read repair

  41. ©2012 DataStax Dynamic snitch (dealing with partial failure) Client Coordinator

    40% busy 90% busy 30% busy
  42. ©2012 DataStax how does it scale?

  43. ©2012 DataStax VLDB benchmark

  44. ©2012 DataStax Scaling antipatterns • Metadata servers • Router bottlenecks

    • Overloading existing nodes when adding capacity
  45. ©2012 DataStax how flexible is it?

  46. ©2012 DataStax

  47. ©2012 DataStax Data model: Realtime Portfolios StockHist stock last GOOG

    $95.52 AAPL $186.10 AMZN $112.98 LiveStocks stock date price GOOG 2011-01-01 $8.23 GOOG 2011-01-02 $6.14 GOOG 2011-001-03 $7.78 user stock shares jbellis GOOG 80 jbellis LNKD 20 yukim AMZN 100
  48. ©2012 DataStax Data model: Analytics worst_date loss 2011-07-23 -$34.81 2011-03-11

    -$11432.24 2011-05-21 -$1476.93 Portfolio1 HistLoss Portfolio2 Portfolio3
  49. ©2012 DataStax Data model: Analytics stock rdate return GOOG 2011-07-25

    $8.23 GOOG 2011-07-24 $6.14 GOOG 2011-07-23 $7.78 AAPL 2011-07-25 $15.32 AAPL 2011-07-24 $12.68 10dayreturns INSERT OVERWRITE TABLE 10dayreturns SELECT a.stock, b.date as rdate, b.price - a.price FROM StockHist a JOIN StockHist b ON (a.stock = b.stock AND date_add(a.date, 10) = b.date);
  50. ©2012 DataStax Data model: Analytics portfolio rdate preturn Portfolio1 2011-07-25

    $118.21 Portfolio1 2011-07-24 $60.78 Portfolio1 2011-07-23 -$34.81 Portfolio2 2011-07-25 $2143.92 Portfolio3 2011-07-24 -$10.19 portfolio_returns INSERT OVERWRITE TABLE portfolio_returns SELECT portfolio, rdate, SUM(b.return) FROM portfolios a JOIN 10dayreturns b ON (a.stock = b.stock) GROUP BY portfolio, rdate;
  51. ©2012 DataStax Data model: Analytics INSERT OVERWRITE TABLE HistLoss SELECT

    a.portfolio, rdate, minp FROM ( SELECT portfolio, min(preturn) as minp FROM portfolio_returns GROUP BY portfolio ) a JOIN portfolio_returns b ON (a.portfolio = b.portfolio and a.minp = b.preturn); worst_date loss 2011-07-23 -$34.81 2011-03-11 -$11432.24 2011-05-21 -$1476.93 Portfolio1 HistLoss Portfolio2 Portfolio3
  52. ©2012 DataStax

  53. ©2012 DataStax Some Cassandra users

  54. Questions? • http://www.flickr.com/photos/26817893@N05/2573006312/ • http://www.flickr.com/photos/rowanbank/7686239548 • http://www.flickr.com/photos/mervtheswerve/6081933265 • http://www.flickr.com/photos/dg_pics/2526208830 •

    http://www.flickr.com/photos/wainwright/351684037 • http://www.flickr.com/photos/mikeneilson/1606662529 • http://www.flickr.com/photos/sbisson/3852905534 • http://www.flickr.com/photos/breadnbadger/2674928517 Image credits