What's New in Riak 1.4

Alex
July 12, 2013

Intro to Riak and a discussion of the new features and updates available with Riak 1.4.


Transcript

  1. What's in store?
     •  At a High Level
     •  For Developers
     •  Under the Hood
     •  When and Why
     •  Some Use Cases
     •  Commercial Extensions
     •  1.4 and Roadmap
  2. Riak
     •  Dynamo-inspired key/value store
     •  With some extras: search, MapReduce, 2i, links, pre- and post-commit hooks, pluggable backends, HTTP and binary interfaces
     •  Written in Erlang with C/C++
     •  Open source under Apache 2 License
  3. Riak’s Design Goals (1)
     •  High Availability
     •  Low Latency
     •  Horizontal Scalability
     •  Fault Tolerance
     •  Ops Friendliness
     •  Predictability
  4. Riak’s Design Goals (2)
     •  Design informed by Brewer’s CAP Theorem and Amazon’s Dynamo Paper
     •  Riak is tuned to offer availability above all else
  5. Riak is a database that stores keys against values. Keys are grouped into a higher-level namespace called buckets.
  6. Riak doesn’t care what you store. It will accept any data type; things are stored on disk as binaries.
  7. Examples (Application Type | Key | Value)
     •  Session | User/Session ID | Session Data
     •  Advertising | Campaign ID | Ad Content
     •  Logs | Date | Log File
     •  Sensor | Date, Date/Time | Sensor Updates
     •  User Data | Login, eMail, UUID | User Attributes
     •  Content | Title, Integer | Text, JSON/XML/HTTP document, images, etc.
  8. Client Libraries
     Ruby, Node.js, Java, Python, Perl, OCaml, Erlang, PHP, C, Squeak, Smalltalk, Pharo, Clojure, Scala, Haskell, Lisp, Go, .NET, Play, and more (supported by either Basho or the community).
  9. Active Anti-Entropy
     •  Automatic, self-healing property
     •  Repairs divergent, missing, or corrupt replicas caused by hardware failure, bad disks, data corruption, and other failure modes
     •  Useful for large clusters, long-term storage
     •  Uses hash tree exchange
     •  Minimal performance impact
     •  More on our blog
  10. Eventually Consistent Counters
     •  First publicly available, distributed data type in Riak
     •  PN Counters are capable of being both incremented (P) and decremented (N)
     •  Provide automatic conflict resolution after a network partition
  11. Secondary Indexing Improvements
     •  2i queries are now sorted, and the client can request only the first “n” results
     •  Pagination also allows queries to begin where “n” left off to deliver the rest of the results (can paginate through lists in order)
     •  Can also view start value, continuation value, end value, min/max
  12. Other Features
     •  Progress bar for Handoff
     •  Reduced object storage overhead (best for small objects)
     •  Updated PB properties
     •  Overload Protection for vnode processes
     •  Cascading real-time writes for Riak Enterprise multi-datacenter replication
  13. When Might Riak Make Sense
     •  When you have enough data to require >1 physical machine (preferably >5)
     •  When availability is more important than consistency (think “critical data” on “big data”)
     •  When your data can be modeled as keys and values; don’t be afraid to denormalize
  14. •  Cloud infrastructure management
     •  Machine, customer, and API data
     •  “Design for failure” architecture
     “Enstratius relies on Riak to ensure that our cloud infrastructure management platform scales seamlessly, without interruption and performance bottlenecks, while meeting and exceeding internal requirements for high availability and data durability.”
  15. •  Scaling writes in MySQL became a bottleneck
     •  Master/slave replication made master nodes a single point of failure
     •  Multi-site replication
     vimeo.com/bashotech
  16. Social Authentication
     •  Social commerce marketplace
     •  Uses Riak to store all registered accounts and tokens for Facebook/Twitter logins
     •  Looking to move more data over due to operational simplicity
  17. Session Storage
     •  First Basho customer in 2009
     •  Every hit to a Mochi web property results in at least one read, and possibly a write, to Riak
     •  Unavailability or high latency = lost ad revenue
  18. Ad Serving
     •  OpenX served ~4T ads in 2012
     •  Started with CouchDB and Cassandra for various parts of infrastructure
     •  Now consolidating on Riak and Riak Core
  19. Voxer: Post Growth
     •  ~60 nodes total in prod
     •  100s of TBs of data (>1TB daily)
     •  ~400k concurrent users
     •  Billions of daily requests
  20. Riak: Hybrid Solutions
     •  Riak with Postgres
     •  Riak with Elasticsearch
     •  Riak with Hadoop
     •  Secondary analytics clusters
  21. Try Us On…
     •  Amazon AMIs
     •  Engine Yard
     •  Microsoft Azure VM Depot
     •  SoftLayer
  22. Replication in Riak 1.4
     •  Faster, with more connections between clusters
     •  Easier setup and configuration
     •  Better per-connection statistics
     •  Supports SSL, NAT, and full sync scheduling
  23. Riak CS (cloud storage)
     •  Large object support
     •  S3-compatible API
     •  Multi-tenancy
     •  Reporting on usage
     •  Now open source
     •  Riak CS 1.4 coming soon…
  24. Future Work
     •  Tight Solr integration
     •  Greater consistency
     •  Faster data transfer between clusters
     •  Dynamic Ring resizing
     •  Lots of other good stuff, check GitHub
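
Slide 10 says PN Counters can be incremented (P) and decremented (N) and resolve conflicts automatically after a network partition. The sketch below illustrates the general state-based PN-counter idea in Python; it is not Riak's implementation or client API. The class name, the actor ids ("node_a", "node_b"), and the method names are all hypothetical, chosen only for illustration. Each replica tracks per-actor increment and decrement totals, and a merge takes the per-actor maximum, so replicas converge regardless of the order in which they exchange state.

```python
from collections import defaultdict


class PNCounter:
    """Illustrative state-based PN-counter (an assumption-laden sketch,
    not Riak's actual counter code)."""

    def __init__(self, actor):
        self.actor = actor          # hypothetical id of this replica
        self.p = defaultdict(int)   # total increments seen, per actor
        self.n = defaultdict(int)   # total decrements seen, per actor

    def increment(self, amount=1):
        # A replica only ever bumps its own entry in P.
        self.p[self.actor] += amount

    def decrement(self, amount=1):
        # Decrements are recorded as growth in N, never as shrinking P.
        self.n[self.actor] += amount

    def value(self):
        # The visible count is all increments minus all decrements.
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other):
        # Per-actor maximum: commutative, associative, and idempotent,
        # so merging in any order (or repeatedly) gives the same state.
        for actor, count in other.p.items():
            self.p[actor] = max(self.p[actor], count)
        for actor, count in other.n.items():
            self.n[actor] = max(self.n[actor], count)


# Two replicas accept writes independently during a partition...
a = PNCounter("node_a")
b = PNCounter("node_b")
a.increment(5)
b.increment(2)
b.decrement(3)

# ...then exchange state once the partition heals.
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # prints "4 4"
```

Both replicas end at 5 + 2 - 3 = 4 without any application-level conflict handling, which is the behavior the slide describes as automatic conflict resolution.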