New Intro to Riak

Joel Jacobson

August 21, 2013

Transcript

  1. What is Riak? A key-value store plus extras: distributed, horizontally scalable, fault tolerant, highly available, and built for the web.
  2. Inspired by the Amazon Dynamo white paper, released to describe a database system to be used for their shopping cart. Key ideas: masterless, peer-coordinated replication; consistent hashing; eventual consistency.
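As a rough illustration of the consistent hashing idea on this slide, the sketch below hashes a key onto a ring split into equal partitions and picks the next `n` partitions as replica owners. The `Ring` class, the round-robin ownership, the node names, and the partition count of 64 are assumptions made for the example, not Riak's actual code.

```python
import hashlib

RING_SIZE = 2 ** 160  # size of the SHA-1 hash space


def key_hash(key: str) -> int:
    """Map a key to an integer position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical ring: equal partitions assigned to nodes round-robin."""

    def __init__(self, nodes, partitions=64):
        self.partitions = partitions
        self.owners = [nodes[i % len(nodes)] for i in range(partitions)]

    def preference_list(self, key, n=3):
        """Return the owners of the n partitions that follow the key's position."""
        step = RING_SIZE // self.partitions
        first = (key_hash(key) // step + 1) % self.partitions
        return [self.owners[(first + i) % self.partitions] for i in range(n)]


ring = Ring(["node1", "node2", "node3", "node4"])
print(ring.preference_list("user_id"))  # three distinct nodes, same on every call
```

Because every node can compute the same list from the key alone, no master is needed to route a request.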
  3. Riak as a key-value store. Simple operations: GET, PUT, DELETE. The value is opaque, with metadata. Extras: secondary indexes, MapReduce, full-text search.
  4. Horizontal scalability. Near-linear scalability: query load and data are spread evenly. Add more nodes and get more ops/second, storage capacity, and compute power (MapReduce).
  5. Fault tolerant. No single point of failure (SPOF). All data is replicated. Clusters self-heal through handoff and active anti-entropy. The cluster transparently survives node failure and network partition.
  6. Highly available. Any node can serve client requests. Fallbacks are used when nodes are down. Always available for read and write requests. Per-request quorums.
  7. Quorums. n = 3, r / w = 2. r = 1: faster response time, less likely to be consistent. r = all: slower response, greater consistency.
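The trade-off on this slide can be sketched with a toy read path. This is a simplified model, not Riak's implementation: replicas carry a plain integer version (Riak uses vector clocks), and the helper names `quorums_overlap` and `read` are made up for the example.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """A read is guaranteed to see the latest acknowledged write when r + w > n."""
    return r + w > n


def read(replies_in_arrival_order, r: int):
    """Wait for the first r replies and return the value with the highest version."""
    if r > len(replies_in_arrival_order):
        raise ValueError("r cannot exceed the number of replicas")
    first_r = replies_in_arrival_order[:r]
    return max(first_r, key=lambda reply: reply[0])[1]


# n = 3 replicas; the fastest one to answer has not yet received the latest write.
replies = [(1, "old"), (2, "new"), (2, "new")]

print(quorums_overlap(3, 2, 2))  # True
print(read(replies, r=1))        # old  (fast, but stale)
print(read(replies, r=2))        # new
print(read(replies, r=3))        # new  (r = all: waits for every replica)
```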
  8. Disaster scenario. A node fails; the request for hash(“user_id”) goes to a fallback. Handoff: data is returned to the recovered node.
  9. Active anti-entropy. Automatically repairs inconsistencies in data. Runs as a background process, or can be configured as a manual process.
  10. Conflict resolution. Conflicts arise from network partitions or from concurrent actors modifying the same data. Riak provides two solutions to manage this: last write wins and vector clocks.
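To make the first of the two strategies concrete, here is a minimal sketch of last write wins. The record layout (`timestamp`, `value`) and the function name are assumptions for illustration; the point is only that the write with the latest timestamp is kept and any concurrent write is silently discarded.

```python
def last_write_wins(writes):
    """Keep the write with the latest timestamp; all others are dropped."""
    return max(writes, key=lambda w: w["timestamp"])


# Two clients update the same cart on different sides of a partition.
concurrent_writes = [
    {"timestamp": 100, "value": ["book"]},
    {"timestamp": 101, "value": ["lamp"]},
]

winner = last_write_wins(concurrent_writes)
print(winner["value"])  # ['lamp'], the "book" update is lost
```

Vector clocks avoid this silent loss by keeping both versions and letting the application decide.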
  11. Vector clocks. Every node has an ID. Send the last-seen vector clock in every “put” request. Can be viewed as a ‘commit history’, e.g. Git. Lets you decide conflicts.
  12. Sibling creation. Siblings can be created by simultaneous writes or anti-entropy. [Diagram: labels 0, 3, 2, 1; copies of Object v1 tagged with the vector clocks [{a,3}] and [{a,2},{b,1}].]
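The two clocks on this slide, [{a,3}] and [{a,2},{b,1}], can be compared with a small sketch. Clocks are modelled here as Python dicts from node ID to counter; the function names `descends`, `compare`, and `merge` are illustrative choices, not Riak's API.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify the relationship between two vector clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "concurrent"  # neither has seen the other: keep both as siblings


def merge(a: dict, b: dict) -> dict:
    """Clock for a resolved value: the per-node maximum of both clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


left = {"a": 3}
right = {"a": 2, "b": 1}

print(compare(left, right))                # concurrent
resolved = merge(left, right)
print(compare(resolved, left))             # a is newer
```

Neither clock descends from the other, so neither write can safely replace the other and both are kept as siblings. A value written back with the merged clock supersedes both.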
  13. Bitcask. A fast, append-only key-value store. Key space must fit in memory. Suitable for bounded data, e.g. reference data.
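The reason the key space must fit in memory can be seen in a toy model of the append-only design: values are only ever appended to a log, and an in-memory index maps every key to the location of its latest value. The `TinyBitcask` class below is a hypothetical sketch that uses an in-memory buffer in place of a data file and leaves out deletes, compaction, and crash recovery.

```python
import io


class TinyBitcask:
    """Toy append-only store: a log of values plus an in-memory key index."""

    def __init__(self):
        self.log = io.BytesIO()  # stands in for the append-only data file
        self.keydir = {}         # key -> (offset, size) of the latest value

    def put(self, key: str, value: bytes) -> None:
        offset = self.log.seek(0, io.SEEK_END)  # always write at the end
        self.log.write(value)
        self.keydir[key] = (offset, len(value))

    def get(self, key: str) -> bytes:
        offset, size = self.keydir[key]  # one index lookup, then one read
        self.log.seek(offset)
        return self.log.read(size)


db = TinyBitcask()
db.put("country:se", b"Sweden")
db.put("country:se", b"Sverige")  # the old bytes stay in the log
print(db.get("country:se"))       # b'Sverige'
```

Every key lives in `keydir`, so memory use grows with the number of keys rather than with the size of the values.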
  14. LevelDB. Append-only. For very large data sets. Multiple levels. Allows for more advanced querying (2i). Includes compression (Snappy algorithm). Suitable for unbounded data.
  15. Memory. Data is never persisted to disk. Definable memory limits per vnode. Configurable object expiry. Useful for highly transient data. Supports secondary indexes.
  16. Client libraries. Client libraries supported by Basho. Community-supported languages and frameworks: C/C++, Clojure, Common Lisp, Dart, Django, Go, Grails, Griffon, Groovy, Haskell, .NET, Node.js, OCaml, Perl, PHP, Play, Racket, Scala, Smalltalk.
  17. Using Riak as the datastore for all back-end systems supporting Angry Birds: game-state storage, ID/login, payments, push notifications, analytics, advertisements. 9 clusters in use with over 100 nodes. 263 million active monthly users.
  18. Spine2: storing data on 80 million+ patients. 500 complex messages per second. 20,000 integrated end points. 0% data loss. 99.9% availability SLA.
  19. Push-to-talk application. Billions of requests daily. > 50 dedicated servers. Everything stored in Riak.
  20. MDC. Allows data to be replicated between clusters in different data centers. Real-time and full sync. Uni-directional or bi-directional replication. Global load balancing. Backups.
  21. Riak CS. S3-compatible object store. Supports objects of arbitrary content type, up to 5 TB. Multi-tenancy. Per-tenant usage data and statistics on network I/O. Supports MDC.