Riak Overview - includes 1.3 features

Riak architecture, operations, APIs, data model and querying, plus features from the latest 1.3 release.


Basho Technologies

March 25, 2013


  1. Riak Intro

  2. Us •  @basho •  basho.com •  github.com/basho •  docs.basho.com

  4. What's in store? •  At a High Level •  For Developers •  Under the Hood •  When and Why •  Some Use Cases •  Commercial Extensions •  1.3 and Roadmap
  5. At a High Level

  6. Riak •  Dynamo-inspired key/value store •  with some extras: search, MapReduce, 2i, links, pre- and post-commit hooks, pluggable backends, HTTP and binary interfaces •  Written in Erlang with C/C++ •  Open source under Apache 2 License
  7. Riak’s Design Goals (1) •  High-availability •  Low-latency •  Horizontal Scalability •  Fault Tolerance •  Ops Friendliness •  Predictability
  8. Riak’s Design Goals (2) •  Design Informed by Brewer’s CAP Theorem and Amazon’s Dynamo Paper •  Riak is tuned to offer availability above all else
  9. For Developers

  10. Riak is a database that stores keys against values. Keys are grouped into a higher-level namespace called buckets.
  11. Riak doesn’t care what you store. It will accept any data type; things are stored on disk as binaries.
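To make the bucket/key/value model concrete, here is a minimal in-memory sketch. It is not Riak's API: `ToyStore` and its methods are hypothetical names used only for illustration. It shows that a value is addressed by a (bucket, key) pair and that the store treats the payload as opaque bytes, leaving encoding and decoding to the application.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ToyStore:
    """Illustrative stand-in for a bucket/key namespace (hypothetical, not Riak's API)."""
    objects: Dict[Tuple[str, str], Tuple[str, bytes]] = field(default_factory=dict)

    def put(self, bucket: str, key: str, value: bytes, content_type: str) -> None:
        # The store never inspects the payload; it only keeps the raw bytes
        # together with the content type the caller declared.
        self.objects[(bucket, key)] = (content_type, value)

    def get(self, bucket: str, key: str) -> Tuple[str, bytes]:
        return self.objects[(bucket, key)]


store = ToyStore()

# JSON and binary image data are both just bytes to the store.
store.put("sessions", "user-42", json.dumps({"cart": [1, 2]}).encode("utf-8"), "application/json")
store.put("images", "user-42", b"\x89PNG\r\n\x1a\n", "image/png")

# The application decides how to decode what comes back.
content_type, raw = store.get("sessions", "user-42")
session = json.loads(raw.decode("utf-8"))
```

Note that the same key (`user-42`) lives independently in two buckets, which is what "higher-level namespace" means here.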
  16. Examples

    Application Type | Key | Value
    Session | User/Session ID | Session Data
    Advertising | Campaign ID | Ad Content
    Logs | Date | Log File
    Sensor | Date, Date/Time | Sensor Updates
    User Data | Login, eMail, UUID | User Attributes
    Content | Title, Integer | Text, JSON/XML/HTTP document, images, etc.
  17. Two APIs 1.  HTTP (just like the web) 2.  Protocol Buffers (thank you, Google)
  18. Querying •  GET/PUT/DELETE •  MapReduce •  Full-Text Search •  Secondary Indexes (2i)
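The basic GET/PUT/DELETE operations over the HTTP API can be sketched as plain HTTP requests. The sketch below only builds the requests and does not send them. It assumes a node listening on the default HTTP port 8098 on localhost and the `/buckets/<bucket>/keys/<key>` URL layout from the Riak 1.x HTTP documentation; the helper names (`object_url`, `build_put`, and so on) are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a local Riak node on the default HTTP port.
BASE_URL = "http://127.0.0.1:8098"


def object_url(bucket: str, key: str) -> str:
    """URL of one object; bucket and key are percent-encoded."""
    return f"{BASE_URL}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def build_put(bucket: str, key: str, body: bytes, content_type: str) -> Request:
    """Store a value; the content type tells readers how to interpret the bytes."""
    return Request(
        object_url(bucket, key),
        data=body,
        headers={"Content-Type": content_type},
        method="PUT",
    )


def build_get(bucket: str, key: str) -> Request:
    """Fetch a value by bucket and key."""
    return Request(object_url(bucket, key), method="GET")


def build_delete(bucket: str, key: str) -> Request:
    """Remove a value by bucket and key."""
    return Request(object_url(bucket, key), method="DELETE")


put_req = build_put("sessions", "user 42", b'{"cart": [1, 2]}', "application/json")
get_req = build_get("sessions", "user 42")
del_req = build_delete("sessions", "user 42")
```

Against a running node, each request could be sent with `urllib.request.urlopen(put_req)`; the Protocol Buffers interface exposes the same operations through a client library instead.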

  19. Client Libraries Ruby, Node.js, Java, Python, Perl, OCaml, Erlang, PHP, C, Squeak, Smalltalk, Pharo, Clojure, Scala, Haskell, Lisp, Go, .NET, Play, and more (supported by either Basho or the community).
  20. Under the Hood

  21. •  Consistent Hashing and Replicas •  Handoff and Rebalancing

  22. Masterless; deployed as a cluster of nodes
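Consistent hashing with replicas can be sketched in a few lines. This is a simplified illustration, not Riak's implementation: it hashes the string `bucket/key` with SHA-1 onto a 160-bit ring, splits the ring into equal partitions, and places replicas on consecutive partitions. The ring size of 64, the replica count of 3, the round-robin partition ownership, and all function names are assumptions made for the example.

```python
import hashlib
from typing import List

RING_SIZE = 64        # assumption: number of equal partitions on the ring
N_VAL = 3             # assumption: replicas kept per object
KEYSPACE = 2 ** 160   # size of the SHA-1 output space


def partition_for(bucket: str, key: str, ring_size: int = RING_SIZE) -> int:
    """Map a bucket/key pair to the partition that covers its hash position."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position * ring_size // KEYSPACE


def preference_list(bucket: str, key: str,
                    ring_size: int = RING_SIZE, n_val: int = N_VAL) -> List[int]:
    """Partitions holding the replicas: the first one plus its successors on the ring."""
    first = partition_for(bucket, key, ring_size)
    return [(first + i) % ring_size for i in range(n_val)]


def owner(partition: int, nodes: List[str]) -> str:
    """Simplified ownership: partitions are dealt out to nodes round-robin."""
    return nodes[partition % len(nodes)]


nodes = ["node1", "node2", "node3", "node4", "node5"]
plist = preference_list("sessions", "user-42")
replica_nodes = [owner(p, nodes) for p in plist]
```

Because every node can compute the same preference list from the key alone, no master is needed to route a request. Adding a node changes who owns some partitions, not how keys map to partitions, which is what keeps handoff and rebalancing incremental.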

  34. Riak 1.3

  35. Active Anti-Entropy •  Automatic, self-healing property •  Repairs divergent, missing or corrupt replicas caused by hardware failure, bad disks, data corruption, and other failure modes •  Useful for large clusters, long term storage •  Uses hash tree exchange •  Minimal performance impact •  More on our blog
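The idea behind a hash tree exchange can be sketched as follows. This is a toy two-level tree, not Riak's on-disk implementation: keys are grouped into a fixed number of segments, each segment gets a hash, and a root hash covers all segments. Two replicas compare roots first and only descend into segments whose hashes differ, so matching data costs almost nothing to verify. The segment count and all names are assumptions for illustration.

```python
import hashlib
from typing import Dict, List, Set, Tuple

SEGMENTS = 16  # assumption: a tiny tree, for illustration only


def _h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def segment_of(key: str) -> int:
    """Assign a key to a segment based on the first byte of its hash."""
    return _h(key.encode("utf-8"))[0] % SEGMENTS


def build_tree(replica: Dict[str, bytes]) -> Tuple[bytes, List[bytes], List[Dict[str, bytes]]]:
    """Return (root hash, per-segment hashes, per-segment {key: value hash})."""
    contents: List[Dict[str, bytes]] = [dict() for _ in range(SEGMENTS)]
    for key, value in replica.items():
        contents[segment_of(key)][key] = _h(value)
    segment_hashes = []
    for segment in contents:
        acc = hashlib.sha1()
        for key in sorted(segment):  # sorted so both replicas hash in the same order
            acc.update(key.encode("utf-8"))
            acc.update(segment[key])
        segment_hashes.append(acc.digest())
    return _h(b"".join(segment_hashes)), segment_hashes, contents


def divergent_keys(a: Dict[str, bytes], b: Dict[str, bytes]) -> Set[str]:
    """Keys whose values differ or are missing on one side."""
    root_a, segs_a, cont_a = build_tree(a)
    root_b, segs_b, cont_b = build_tree(b)
    if root_a == root_b:
        return set()  # replicas agree; nothing else needs comparing
    diff: Set[str] = set()
    for i in range(SEGMENTS):
        if segs_a[i] != segs_b[i]:  # only inspect segments that disagree
            keys = set(cont_a[i]) | set(cont_b[i])
            diff |= {k for k in keys if cont_a[i].get(k) != cont_b[i].get(k)}
    return diff


healthy = {"k1": b"v1", "k2": b"v2", "k3": b"v3"}
damaged = {"k1": b"v1", "k2": b"stale"}  # one divergent value, one missing key
to_repair = divergent_keys(healthy, damaged)
```

The keys returned in `to_repair` are the ones a repair step would need to rewrite on the damaged replica.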
  36. New Look for Riak Control

  37. Other Features •  Expanded IPv6 support (protocol buffers and handoff interfaces) •  MapReduce backpressure •  Optionally send log messages to syslog
  38. Riak: when and why

  39. When Might Riak Make Sense •  When you have enough data to require >1 physical machine (preferably >5) •  When availability is more important than consistency (think “critical data” on “big data”) •  When your data can be modeled as keys and values; don’t be afraid to denormalize
  40. •  Cloud infrastructure management •  Machine, customer and API data •  “Design for failure” architecture “Enstratius relies on Riak to ensure that our cloud infrastructure management platform scales seamlessly, without interruption and performance bottlenecks, while meeting and exceeding internal requirements for high availability and data durability.”
  41. •  Scaling writes in MySQL became a bottleneck •  Master/slave replication made master nodes a single point of failure •  Multi-site replication ← vimeo.com/bashotech
  42. ← ricon2012.com •  Re-platform of e-commerce platform

  43. Session Storage •  First Basho customer in 2009 •  Every hit to a Mochi web property results in at least one read, and possibly a write, to Riak •  Unavailability or high latency = lost ad revenue
  44. Ad Serving •  OpenX served ~4T ads in 2012 •  Started with CouchDB and Cassandra for various parts of infrastructure •  Now consolidating on Riak and Riak Core
  45. Riak for All Storage: Voxer

  46. Voxer: Post Growth •  ~60 nodes total in prod •  100s of TBs of data (>1TB daily) •  ~400k concurrent users •  Billions of daily requests
  47. Riak: Hybrid Solutions •  Riak with Postgres •  Riak with Elasticsearch •  Riak with Hadoop •  Secondary analytics clusters
  48. Try Us On… •  Amazon AMIs •  Engine Yard •  Microsoft Azure VM Depot •  Riakon.com
  49. Commercial Software

  50. Riak Enterprise •  Multi-datacenter replication •  Real-time or full sync •  24/7 support
  51. Advanced Multi-Datacenter •  Faster, with more connections between clusters •  Easier setup and configuration •  Better per-connection statistics •  Already used in production by many Riak Enterprise customers
  52. Riak Cloud Storage •  Large object support •  S3-compatible API •  Multi-tenancy •  Reporting on usage •  Now open source
  53. Roadmap Stuff...

  54. Future Work •  CRDTs •  Tight Solr integration •  Greater consistency •  Lots of other good stuff, check GitHub
  55. Riak •  docs.basho.com •  @basho •  github.com/basho