Riak: What's all the fuss about?

Sam Elliott
September 12, 2012

An introduction to Riak, for EdLambda.

The talk covers Dynamo first, then Riak's extra features, and finally a little on Erlang/OTP (only why it was a good fit).

Transcript

  1. Sam Elliott
    EdLambda - 2012/09

  2. Introduction Dynamo Riak Erlang/OTP Conclusions
    • Dynamo
    • Riak
    • Erlang/OTP
    Sam Elliott Riak: What’s all the fuss about?

  3. Basho & Riak
    • Riak is Open-Source
    • Most developers employed by Basho
    • Basho can provide professional support

  4. Dynamo
    Dynamo: Amazon’s Highly Available Key-value Store
    This paper presents the design and implementation of Dynamo, a
    highly available key-value storage system that some of Amazon’s
    core services use to provide an always-on experience. [DeCandia
    et al., 2007]

  5. CAP Theorem [Brewer, 2000]
    • Consistency
    • Availability
    • Partition-tolerance

  7. Consistent Hashing [Karger et al., 1997]
    [Ring diagram: 0 to 2^160]

  8. Consistent Hashing [Karger et al., 1997]
    [Ring diagram: 0 to 2^160]
    hash(<>,<>)

  9. Consistent Hashing [Karger et al., 1997]
    [Ring diagram: 0 to 2^160]
    hash(<>,<>)
    riak1.example.com
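
The ring on these slides can be sketched in a few lines of Python. This is a minimal illustration, not Riak's implementation: the `/` separator used to join bucket and key, the partition count of 64, and the function names are all assumptions made for the example. Only the 0 to 2^160 range (the SHA-1 output space) comes from the slides.

```python
import hashlib

RING_SIZE = 2 ** 160      # SHA-1 output space, matching the 0..2^160 ring
NUM_PARTITIONS = 64       # assumed partition count for this sketch


def ring_position(bucket: bytes, key: bytes) -> int:
    """Hash a bucket/key pair to an integer position on the ring.

    Assumption: the pair is joined with "/" before hashing; a real
    cluster uses its own encoding of the pair.
    """
    digest = hashlib.sha1(bucket + b"/" + key).digest()
    return int.from_bytes(digest, "big")


def partition_index(position: int) -> int:
    """Map a ring position to the index of the equal-sized partition holding it."""
    return position // (RING_SIZE // NUM_PARTITIONS)


position = ring_position(b"products", b"thingy-3000")
print(partition_index(position))
```

Because the hash depends only on the bucket and key, any node can compute where an object lives without consulting a central directory.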

  11. Nodes & Virtual Nodes
    [Ring diagram: 0 to 2^160]
    riak1.example.com
    riak2.example.com

  14. Nodes & Virtual Nodes
    [Ring diagram: 0 to 2^160]
    riak1.example.com
    riak2.example.com
    riak3.example.com

  15. Nodes & Virtual Nodes
    [Ring diagram: 0 to 2^160]
    riak1.example.com
    riak2.example.com
    riak3.example.com
    riak4.example.com
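
As a rough sketch of how virtual nodes spread the ring across physical nodes, the example below hands out partitions round-robin. The round-robin rule, the partition count of 16, and the name `claim_partitions` are assumptions for illustration; Riak's actual claim algorithm is more involved.

```python
def claim_partitions(nodes, num_partitions=16):
    """Assign partitions (virtual nodes) to physical nodes round-robin.

    Returns a dict mapping each node name to the partitions it owns.
    """
    ownership = {node: [] for node in nodes}
    for partition in range(num_partitions):
        owner = nodes[partition % len(nodes)]
        ownership[owner].append(partition)
    return ownership


two_nodes = claim_partitions(["riak1.example.com", "riak2.example.com"])
four_nodes = claim_partitions([
    "riak1.example.com",
    "riak2.example.com",
    "riak3.example.com",
    "riak4.example.com",
])
print(len(two_nodes["riak1.example.com"]))   # 8 partitions each with two nodes
print(len(four_nodes["riak1.example.com"]))  # 4 partitions each with four nodes
```

Adding nodes shrinks each node's share of the ring while the number of partitions stays fixed, which is the point of having many more virtual nodes than machines.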

  16. Replicas & Quorums
    [Ring diagram: 0 to 2^160]
    hash(<>,<>)
    read(<>, <>)

  17. Replicas & Quorums
    [Ring diagram: 0 to 2^160]
    hash(<>,<>)
    read(<>, <>, N=3)

  19. Replicas & Quorums
    [Ring diagram: 0 to 2^160]
    hash(<>,<>)
    read(<>, <>, N=3, R=2)
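
The read with N=3 and R=2 can be sketched as follows. This is a simplified model under stated assumptions: replicas live on the N consecutive partitions starting at the key's partition, an unreachable replica is represented by `None`, and both function names are invented for the example.

```python
def preference_list(partition, n_val, num_partitions):
    """Return the n_val consecutive partitions starting at `partition`,
    wrapping around the end of the ring."""
    return [(partition + i) % num_partitions for i in range(n_val)]


def quorum_read(replicas, prefs, r):
    """Gather replies from the partitions in `prefs`; succeed once `r` replied.

    `replicas` maps partition -> stored value, with None standing in for
    an unreachable replica (a simplification for this sketch).
    """
    replies = [replicas[p] for p in prefs if replicas.get(p) is not None]
    if len(replies) < r:
        raise RuntimeError(f"only {len(replies)} of the required {r} replicas replied")
    return replies[:r]


prefs = preference_list(14, n_val=3, num_partitions=16)   # wraps: [14, 15, 0]
replicas = {14: "v1", 15: None, 0: "v1"}                  # one replica is down
print(quorum_read(replicas, prefs, r=2))                  # still succeeds with R=2
```

With one of the three replicas down, the R=2 read still succeeds; asking for R=3 in the same situation would fail.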

  21. Quorum Modes & Values
    Values: all, quorum, integer
    Reads:
    • r
    • pr
    Writes:
    • w
    • pw
    • dw
    Deletes:
    • r
    • w
    • pr
    • pw
    • dw
    • rw
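
The three kinds of value can each be resolved to a replica count. The helper below is a sketch with an invented name, `resolve_quorum`; it assumes "all" means every one of the N replicas and "quorum" means a majority, N/2 + 1 rounded down.

```python
def resolve_quorum(value, n_val):
    """Turn a quorum setting ("all", "quorum", or an integer) into a replica count."""
    if value == "all":
        return n_val
    if value == "quorum":
        return n_val // 2 + 1          # majority of the replicas
    if isinstance(value, int) and 1 <= value <= n_val:
        return value
    raise ValueError(f"invalid quorum value {value!r} for n_val={n_val}")


print(resolve_quorum("all", 3))      # 3
print(resolve_quorum("quorum", 3))   # 2
print(resolve_quorum(1, 3))          # 1
```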

  23. Vector Clocks
    A:1

  24. Vector Clocks
    A:1 A:2

  25. Vector Clocks
    A:1 A:2 A:3

  26. Vector Clocks
    A:1 A:2
    A:2, B:1

  27. Vector Clocks
    A:1 A:2 A:3
    A:2, B:1

  28. Vector Clocks
    A:1 A:2 A:3
    A:2, B:1 A:3, B:2

  29. Vector Clocks
    A:1 A:2 A:3
    A:2, B:1
    A:4, B:1
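
The clock values on these slides can be reproduced with a small sketch. It represents a vector clock as a plain dict from actor to counter; the function names are chosen for the example and are not Riak's API.

```python
def increment(clock, actor):
    """Return a copy of `clock` with `actor`'s counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def concurrent(a, b):
    """True if neither clock descends from the other, i.e. the writes conflict."""
    return not descends(a, b) and not descends(b, a)


def merge(a, b):
    """Combine two clocks by taking the highest counter seen for each actor."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in a.keys() | b.keys()}


a2 = {"A": 2}
a3 = increment(a2, "A")            # A writes again:        A:3
b1 = increment(a2, "B")            # B writes from A:2:     A:2, B:1
print(concurrent(a3, b1))          # True: the two writes are siblings

merged = merge(a3, b1)             # A:3, B:1
print(increment(merged, "B"))      # B resolves the conflict: A:3, B:2
print(increment(merged, "A"))      # A resolves the conflict: A:4, B:1
```

The last two lines correspond to the two outcomes shown on the final slides: whichever actor resolves the siblings writes back a clock that descends from both.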

  30. Riak
    Dynamo, with:
    • MapReduce
    • Links & Link-walking
    • Secondary Indexes
    • Fulltext Search
    • Hooks

  31. Riak

  32. Example Application
    An Online Shop
    • Products
    • Search
    • Related Products
    • Categories
    • Baskets
    • Invoices

  33. Products
    PUT /buckets/products/keys/thingy-3000
    Content-Type: application/json
    {...}
    GET /buckets/products/keys/thingy-3000
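
The two requests above could be built from Python's standard library roughly as follows. This sketch only constructs the requests and does not send them. The base URL `http://localhost:8098` and the `title` field in the product body are assumptions, since the slide elides the JSON document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8098"   # assumed address of a local node


def put_product(key, product):
    """Build the PUT request that stores `product` as JSON under `key`."""
    return urllib.request.Request(
        f"{BASE_URL}/buckets/products/keys/{key}",
        data=json.dumps(product).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_product(key):
    """Build the GET request that fetches the object stored under `key`."""
    return urllib.request.Request(f"{BASE_URL}/buckets/products/keys/{key}", method="GET")


put_req = put_product("thingy-3000", {"title": "Thingy 3000"})  # hypothetical body
get_req = get_product("thingy-3000")
print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it, given a node listening at the assumed address.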

  34. Attachments I
    PUT /buckets/attachments/props
    n_val=3
    r=1
    w=all
    last_write_wins=true

  35. Attachments II
    POST /buckets/attachments/keys
    X-Riak-Index-productkey_bin: thingy-3000
    Content-type: image/jpeg
    ...
    GET /buckets/attachments/keys/bzPyg...
    GET /buckets/attachments/index/productkey_bin/thingy-3000
    {"keys":["bzPyg...", ...]}

  36. Search I
    bin/search-cmd install products
    bin/search-cmd search-doc products 'thingy 3000'
    bin/search-cmd search-doc products 'title:thingy'
    GET /solr/products/select?...

  37. Search II
    {
      schema,
      [
        {version, "1.1"},
        {default_field, "description"},
        {default_op, "and"},
        {analyzer_factory, {erlang, text_analyzers, standard_analyzer_factory}}
      ],
      [
        {field, [
          {name, "title"},
          {required, true}
        ]},
        {field, [
          {name, "description"},
          {required, true}
        ]},
        {field, [
          {name, "id"},
          {analyzer_factory, {erlang, text_analyzers, noop_analyzer_factory}}
        ]},
        {field, [
          {name, "released_at"},
          {type, date}
        ]},
        {dynamic_field, [
          {name, "*"}
        ]}
      ]
    }

  38. Related Products
    PUT /buckets/products/keys/thingy-3000
    Content-Type: application/json
    Link: ; riaktag="related",
    ; riaktag="related"
    {...}
    GET /buckets/products/keys/thingy-3000/products,related,1

  39. Categories
    PUT /buckets/products/keys/thingy-3000
    Content-Type: application/json
    X-Riak-Index-category_bin: electronics
    X-Riak-Index-category_bin: computers
    {...}
    GET /buckets/products/index/category_bin/electronics
    {"keys":["thingy-3000", ...]}
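
What the index headers and the index query do can be modelled in memory. The class below is purely illustrative: `IndexedBucket` and its methods are invented for this sketch and say nothing about how Riak stores its indexes. It only mirrors the behaviour on the slide, where an object is tagged with index entries and later looked up by an exact index value.

```python
from collections import defaultdict


class IndexedBucket:
    """In-memory stand-in for a bucket with secondary indexes (illustrative only)."""

    def __init__(self):
        self.objects = {}                  # key -> stored value
        self.entries = {}                  # key -> set of (index, value) pairs
        self.indexes = defaultdict(set)    # (index, value) -> set of keys

    def put(self, key, value, indexes=()):
        """Store `value` and tag `key` with each (index, value) pair given."""
        # Drop index entries left over from a previous version of the object.
        for entry in self.entries.get(key, ()):
            self.indexes[entry].discard(key)
        self.objects[key] = value
        self.entries[key] = set(indexes)
        for entry in self.entries[key]:
            self.indexes[entry].add(key)

    def query(self, index, value):
        """Return the keys tagged with exactly this index value."""
        return {"keys": sorted(self.indexes.get((index, value), ()))}


products = IndexedBucket()
products.put(
    "thingy-3000",
    {"title": "Thingy 3000"},   # hypothetical body; the slide elides it
    indexes=[("category_bin", "electronics"), ("category_bin", "computers")],
)
print(products.query("category_bin", "electronics"))   # {'keys': ['thingy-3000']}
```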

  40. Basket I
    One Approach to Eventual Consistency:
    • Use allow_mult
    • Teach your application how to resolve conflicts
    • Fetch all siblings

  42. Basket II
    • Each client maintains their own per-session basket
    (Map product → amount)
    • Upon fetch, merge into a single basket
    (Map product → sum(amounts))
    In General:
    • Store the data somewhat like a vector clock
    • Teach your application how to resolve that structure into an object
    • CRDTs: Commutative Replicated Data Types [Shapiro and Baquero, 2011]
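
The basket scheme can be sketched as two steps: merge the siblings, then resolve the merged structure into a single basket. The data shape (`{client_id: {product: amount}}`), the function names, and the rule that a client's amounts only grow (so the largest value seen is the latest) are assumptions made for this example, not something the slides specify.

```python
def merge_siblings(siblings):
    """Merge sibling baskets shaped {client_id: {product: amount}}.

    Each client only writes its own entry, so for a given client and
    product the largest amount seen is taken as the most recent one
    (assumption: per-client amounts only grow).
    """
    merged = {}
    for sibling in siblings:
        for client, items in sibling.items():
            client_items = merged.setdefault(client, {})
            for product, amount in items.items():
                client_items[product] = max(client_items.get(product, 0), amount)
    return merged


def resolve_basket(basket):
    """Collapse per-client baskets into one basket: product -> sum(amounts)."""
    totals = {}
    for items in basket.values():
        for product, amount in items.items():
            totals[product] = totals.get(product, 0) + amount
    return totals


siblings = [
    {"session-a": {"thingy-3000": 1}},
    {"session-a": {"thingy-3000": 1}, "session-b": {"thingy-3000": 2, "widget": 1}},
]
merged = merge_siblings(siblings)
print(resolve_basket(merged))   # {'thingy-3000': 3, 'widget': 1}
```

Because the merge takes a per-entry maximum, it gives the same result whatever order the siblings arrive in, which is the property CRDTs formalise.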

  43. Invoices
    A real job for MapReduce [Dean and Ghemawat, 2008]
    POST /mapred
    Content-type: application/json
    {
      "inputs": {
        "bucket": "invoices",
        "index": "$key",
        "start": 0,
        "end": "zzzz"
      },
      "query": {
        "map": ,
        "reduce":
      }
    }
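
The slide leaves the map and reduce functions out, so the sketch below does not try to reconstruct them. It only shows the general shape of a map phase followed by a reduce phase, run locally over a list of invoices. The `total` field, the choice to sum totals, and all function names are hypothetical.

```python
def map_invoice(invoice):
    """Map phase: emit the total of a single invoice (hypothetical field name)."""
    return [invoice["total"]]


def reduce_sum(values):
    """Reduce phase: fold a list of values into a single-element list.

    Returning a list lets the function be re-applied to its own output.
    """
    return [sum(values)]


def run_mapreduce(objects, map_fn, reduce_fn):
    """Apply `map_fn` to every object, then `reduce_fn` to the combined results."""
    mapped = []
    for obj in objects:
        mapped.extend(map_fn(obj))
    return reduce_fn(mapped)


invoices = [{"total": 10}, {"total": 5}, {"total": 25}]   # made-up sample data
print(run_mapreduce(invoices, map_invoice, reduce_sum))    # [40]
```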

  44. Erlang/OTP
    • Concurrency
    • Fault-tolerance
    • Distribution

  45. Conclusions
    • Distributed databases aren’t trivial
    • Riak doesn’t hide anything
    • Riak’s extra features are quite cool

  46. Any Questions?

  47. References
    Eric A. Brewer. Towards robust distributed systems (abstract). In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing, PODC ’00, pages 7–, New York, NY, USA, 2000. ACM. ISBN 1-58113-183-6. doi: 10.1145/343477.343502. URL http://doi.acm.org/10.1145/343477.343502.
    Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008. URL http://dl.acm.org/citation.cfm?id=1327492.
    G. DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon’s highly available key-value store. ACM SIGOPS Operating Systems Review, 41(6):205–220, 2007. URL http://dl.acm.org/citation.cfm?id=1294281.
    D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, pages 654–663. ACM, 1997. URL http://dl.acm.org/citation.cfm?id=258660.
    Marc Shapiro and C. Baquero. A comprehensive study of Convergent and Commutative Replicated Data Types. 2011. URL http://hal.archives-ouvertes.fr/inria-00555588/.
    Mathias Meyer. Riak Handbook. URL http://riakhandbook.com/.