Slide 1

Slide 1 text

Intro & What’s New in Riak 1.4

Slide 2

Slide 2 text

Us
• @basho
• basho.com
• github.com/basho
• docs.basho.com

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

What's in store?
• At a High Level
• For Developers
• Under the Hood
• When and Why
• Some Use Cases
• Commercial Extensions
• 1.4 and Roadmap

Slide 5

Slide 5 text

At a High Level

Slide 6

Slide 6 text

Riak
• Dynamo-inspired key/value store
• with some extras: search, MapReduce, 2i, links, pre- and post-commit hooks, pluggable backends, HTTP and binary interfaces
• Written in Erlang with C/C++
• Open source under Apache 2 License

Slide 7

Slide 7 text

Riak’s Design Goals (1)
• High Availability
• Low Latency
• Horizontal Scalability
• Fault Tolerance
• Ops Friendliness
• Predictability

Slide 8

Slide 8 text

Riak’s Design Goals (2)
• Design informed by Brewer’s CAP Theorem and Amazon’s Dynamo Paper
• Riak is tuned to offer availability above all else

Slide 9

Slide 9 text

For Developers

Slide 10

Slide 10 text

Riak is a database that stores keys against values. Keys are grouped into a higher-level namespace called buckets.
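
The bucket/key model described above can be pictured with a toy in-memory store. This is only an illustrative sketch, not Riak's implementation: the class name `TinyKV` and its methods are hypothetical, and values are treated as opaque bytes.

```python
from typing import Dict, List, Optional, Tuple


class TinyKV:
    """Toy model of a bucket/key namespace (hypothetical, not Riak code)."""

    def __init__(self) -> None:
        # Every value is addressed by the pair (bucket, key).
        self._data: Dict[Tuple[str, str], bytes] = {}

    def put(self, bucket: str, key: str, value: bytes) -> None:
        # The store does not interpret the value; it is just bytes.
        self._data[(bucket, key)] = value

    def get(self, bucket: str, key: str) -> Optional[bytes]:
        return self._data.get((bucket, key))

    def delete(self, bucket: str, key: str) -> None:
        self._data.pop((bucket, key), None)

    def keys(self, bucket: str) -> List[str]:
        # The same key may exist independently in different buckets.
        return sorted(k for (b, k) in self._data if b == bucket)


store = TinyKV()
store.put("sessions", "u1", b'{"cart": []}')
store.put("users", "u1", b'{"name": "Ada"}')
print(store.get("sessions", "u1"))
print(store.keys("users"))
```

The same key ("u1") lives in two buckets without conflict, which is the point of the higher-level namespace.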

Slide 11

Slide 11 text

Riak doesn’t care what you store. It will accept any data type; things are stored on disk as binaries.

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

Examples
Application Type | Key                | Value
Session          | User/Session ID    | Session Data
Advertising      | Campaign ID        | Ad Content
Logs             | Date               | Log File
Sensor           | Date, Date/Time    | Sensor Updates
User Data        | Login, eMail, UUID | User Attributes
Content          | Title, Integer     | Text, JSON/XML/HTTP document, images, etc.

Slide 17

Slide 17 text

Two APIs
1. HTTP (just like the web)
2. Protocol Buffers (thank you, Google)
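
As a rough illustration of the HTTP interface, the sketch below builds (but does not send) a PUT request for one object. It assumes the `/buckets/<bucket>/keys/<key>` path layout and port 8098 as described in Riak's HTTP API documentation; the host, bucket, key, and helper names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def object_url(host: str, port: int, bucket: str, key: str) -> str:
    """Build the URL for one object (path layout is an assumption, see above)."""
    # Percent-encode bucket and key so characters like '/' stay inside one segment.
    return (
        f"http://{host}:{port}"
        f"/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    )


def build_put(host: str, port: int, bucket: str, key: str,
              body: bytes, content_type: str) -> Request:
    """Prepare a PUT request object; nothing is sent over the network here."""
    return Request(
        object_url(host, port, bucket, key),
        data=body,
        headers={"Content-Type": content_type},
        method="PUT",
    )


req = build_put("127.0.0.1", 8098, "sessions", "user/42",
                b'{"cart": []}', "application/json")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the write against a running node; that step is left out so the sketch runs without a cluster.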

Slide 18

Slide 18 text

Querying
• GET/PUT/DELETE
• MapReduce
• Full-Text Search
• Secondary Indexes (2i)

Slide 19

Slide 19 text

Client Libraries
Ruby, Node.js, Java, Python, Perl, OCaml, Erlang, PHP, C, Squeak, Smalltalk, Pharo, Clojure, Scala, Haskell, Lisp, Go, .NET, Play, and more (supported by either Basho or the community).

Slide 20

Slide 20 text

Under the Hood

Slide 21

Slide 21 text

Consistent Hashing and Replicas
Handoff and Rebalancing
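
A minimal sketch of consistent hashing with replicas, under simplifying assumptions: the ring is the SHA-1 output space cut into equal partitions, partitions are handed to nodes round-robin, and a key's replicas go to the owners of the next `n` partitions. The class and method names are hypothetical, and Riak's actual partition-claim and handoff logic is more involved.

```python
import hashlib
from typing import List

RING_SIZE = 2 ** 160  # size of the SHA-1 output space


class Ring:
    """Toy consistent-hashing ring (illustrative only)."""

    def __init__(self, nodes: List[str], num_partitions: int = 64) -> None:
        self.num_partitions = num_partitions
        # Assumption: simple round-robin ownership of partitions.
        self.owners = [nodes[i % len(nodes)] for i in range(num_partitions)]

    def partition_for(self, bucket: str, key: str) -> int:
        digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
        position = int.from_bytes(digest, "big")
        # Scale the ring position down to a partition index in [0, num_partitions).
        return position * self.num_partitions // RING_SIZE

    def preference_list(self, bucket: str, key: str, n: int = 3) -> List[str]:
        # Replicas land on the owners of n consecutive partitions, wrapping around.
        first = self.partition_for(bucket, key)
        return [self.owners[(first + i) % self.num_partitions] for i in range(n)]


ring = Ring(["node1", "node2", "node3", "node4"])
print(ring.partition_for("users", "alice"))
print(ring.preference_list("users", "alice"))
```

Because ownership is attached to partitions rather than to keys, adding or removing a node only changes which node owns some partitions; the key-to-partition mapping stays fixed.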

Slide 22

Slide 22 text

Masterless; deployed as a cluster of nodes

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Active Anti-Entropy
• Automatic, self-healing property
• Repairs divergent, missing, or corrupt replicas caused by hardware failure, bad disks, data corruption, and other failure modes
• Useful for large clusters and long-term storage
• Uses hash tree exchange
• Minimal performance impact
• More on our blog
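
The hash tree exchange mentioned above can be sketched with a two-level tree: each replica hashes its keys into segments, hashes each segment, and hashes the segment hashes into a root. Two replicas compare roots first and only look inside segments whose hashes differ. This is a simplified illustration with hypothetical names and SHA-256 chosen for convenience; it is not Riak's on-disk hash tree.

```python
import hashlib
from typing import Dict, List, Tuple


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def segment_of(key: str, segments: int) -> int:
    # Deterministically map a key to one leaf segment.
    return int.from_bytes(_h(key.encode("utf-8")), "big") % segments


def build_tree(replica: Dict[str, bytes], segments: int = 8) -> Tuple[bytes, List[bytes]]:
    """Return (root hash, list of per-segment leaf hashes) for one replica."""
    buckets: List[List[bytes]] = [[] for _ in range(segments)]
    for key in sorted(replica):
        entry = key.encode("utf-8") + b"\0" + _h(replica[key])
        buckets[segment_of(key, segments)].append(entry)
    leaves = [_h(b"".join(entries)) for entries in buckets]
    return _h(b"".join(leaves)), leaves


def divergent_keys(a: Dict[str, bytes], b: Dict[str, bytes], segments: int = 8) -> List[str]:
    """Find keys whose values differ or that are missing on one side."""
    root_a, leaves_a = build_tree(a, segments)
    root_b, leaves_b = build_tree(b, segments)
    if root_a == root_b:
        return []  # identical roots: nothing to exchange
    differing = {i for i in range(segments) if leaves_a[i] != leaves_b[i]}
    # Only keys in differing segments need a key-by-key comparison.
    return sorted(
        key for key in set(a) | set(b)
        if segment_of(key, segments) in differing and a.get(key) != b.get(key)
    )


replica_a = {"k1": b"v1", "k2": b"v2", "k3": b"v3"}
replica_b = {"k1": b"v1", "k2": b"stale"}
print(divergent_keys(replica_a, replica_b))
```

When replicas agree, a single root comparison is enough, which is why this style of exchange keeps the steady-state cost low.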

Slide 35

Slide 35 text

Riak 1.4

Slide 36

Slide 36 text

Eventually Consistent Counters
• First publicly available, distributed data type in Riak
• PN Counters are capable of being both incremented (P) and decremented (N)
• Provide automatic conflict resolution after a network partition
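
A PN counter can be sketched as two grow-only maps, one for increments (P) and one for decrements (N), each keyed by the actor that made the change. Merging takes the per-actor maximum, so replicas that diverged during a partition converge to the same value. The class below is a generic illustration of the data type with hypothetical names, not Riak's counter code.

```python
from typing import Dict


class PNCounter:
    """Toy PN counter: per-actor increment and decrement totals."""

    def __init__(self, actor: str) -> None:
        self.actor = actor
        self.p: Dict[str, int] = {}  # increments seen, per actor
        self.n: Dict[str, int] = {}  # decrements seen, per actor

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.p[self.actor] = self.p.get(self.actor, 0) + amount

    def decrement(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.n[self.actor] = self.n.get(self.actor, 0) + amount

    def value(self) -> int:
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other: "PNCounter") -> None:
        # Per-actor max is commutative, associative, and idempotent,
        # so merges can happen in any order and any number of times.
        for actor, count in other.p.items():
            self.p[actor] = max(self.p.get(actor, 0), count)
        for actor, count in other.n.items():
            self.n[actor] = max(self.n.get(actor, 0), count)


# Two replicas accept writes independently, then exchange state.
a, b = PNCounter("node_a"), PNCounter("node_b")
a.increment(3)
b.increment(2)
b.decrement(1)
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```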

Slide 37

Slide 37 text

Secondary Indexing Improvements
• 2i queries are now sorted, and clients can request only the first “n” results
• Pagination also allows queries to begin where “n” left off to deliver the rest of the results (can paginate through lists in order)
• Can also view start value, continuation value, end value, min/max
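
The sorted-results-plus-continuation idea can be sketched over an in-memory index. Here the continuation is simply the last entry returned, and the next page resumes strictly after it. The function name, the tuple-shaped continuation, and the sample data are assumptions for illustration; a real client treats the continuation as an opaque token returned by the server.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, str]  # (index value, object key)


def query_range(index: List[Entry], start: int, end: int, max_results: int,
                continuation: Optional[Entry] = None
                ) -> Tuple[List[Entry], Optional[Entry]]:
    """Return one sorted page of matches and a continuation, if more remain."""
    matches = [
        entry for entry in sorted(index)
        if start <= entry[0] <= end
        and (continuation is None or entry > continuation)
    ]
    page = matches[:max_results]
    # Hand back a continuation only when results are left over.
    next_continuation = page[-1] if len(matches) > max_results else None
    return page, next_continuation


age_index = [(30, "carol"), (25, "alice"), (25, "bob"), (41, "dave"), (19, "erin")]

page, cont = query_range(age_index, 20, 45, max_results=2)
while True:
    print(page)
    if cont is None:
        break
    page, cont = query_range(age_index, 20, 45, max_results=2, continuation=cont)
```

Sorting is what makes the continuation meaningful: without a stable order there is no well-defined place to resume from.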

Slide 38

Slide 38 text

Staging in Riak Control

Slide 39

Slide 39 text

Other Features
• Progress bar for Handoff
• Reduced object storage overhead (best for small objects)
• Updated PB properties
• Overload Protection for vnode processes
• Cascading real-time writes for Riak Enterprise multi-datacenter replication

Slide 40

Slide 40 text

Riak: When and Why

Slide 41

Slide 41 text

When Might Riak Make Sense
• When you have enough data to require >1 physical machine (preferably >5)
• When availability is more important than consistency (think “critical data” on “big data”)
• When your data can be modeled as keys and values; don’t be afraid to denormalize

Slide 42

Slide 42 text

User Case Studies

Slide 43

Slide 43 text

• Cloud infrastructure management
• Machine, customer, and API data
• “Design for failure” architecture
“Enstratius relies on Riak to ensure that our cloud infrastructure management platform scales seamlessly, without interruption and performance bottlenecks, while meeting and exceeding internal requirements for high availability and data durability.”

Slide 44

Slide 44 text

• Scaling writes in MySQL became a bottleneck
• Master/slave replication made master nodes a single point of failure
• Multi-site replication
← vimeo.com/bashotech

Slide 45

Slide 45 text

ß ricon.io/archive/ ricon2012.html •  Re-platform of e-commerce platform

Slide 46

Slide 46 text

Social Authentication
• Social commerce marketplace
• Uses Riak to store all registered accounts and tokens for Facebook/Twitter logins
• Looking to move more data over due to operational simplicity

Slide 47

Slide 47 text

Session Storage
• First Basho customer in 2009
• Every hit to a Mochi web property results in at least one read, and possibly a write, to Riak
• Unavailability or high latency = lost ad revenue

Slide 48

Slide 48 text

Ad Serving
• OpenX served ~4T ads in 2012
• Started with CouchDB and Cassandra for various parts of infrastructure
• Now consolidating on Riak and Riak Core

Slide 49

Slide 49 text

Riak for All Storage: Voxer

Slide 50

Slide 50 text

Voxer: Post Growth
• ~60 nodes total in prod
• 100s of TBs of data (>1TB daily)
• ~400k concurrent users
• Billions of daily requests

Slide 51

Slide 51 text

Riak: Hybrid Solutions
• Riak with Postgres
• Riak with Elasticsearch
• Riak with Hadoop
• Secondary analytics clusters

Slide 52

Slide 52 text

Try Us On…
• Amazon AMIs
• Engine Yard
• Microsoft Azure VM Depot
• SoftLayer

Slide 53

Slide 53 text

Commercial Software

Slide 54

Slide 54 text

Riak Enterprise
• Multi-datacenter replication
• Real-time or full sync
• 24/7 support

Slide 55

Slide 55 text

Replication in Riak 1.4
• Faster, with more connections between clusters
• Easier setup and configuration
• Better per-connection statistics
• Supports SSL, NAT, and full sync scheduling

Slide 56

Slide 56 text

Riak CS (cloud storage)
• Large object support
• S3-compatible API
• Multi-tenancy
• Reporting on usage
• Now open source
• Riak CS 1.4 coming soon…

Slide 57

Slide 57 text

Roadmap Stuff...

Slide 58

Slide 58 text

Future Work
• Tight Solr integration
• Greater consistency
• Faster data transfer between clusters
• Dynamic ring resizing
• Lots of other good stuff; check GitHub

Slide 59

Slide 59 text

RICON.io
A distributed systems conference
RICON25Web for 25% off

Slide 60

Slide 60 text

Riak
• docs.basho.com
• @basho
• github.com/basho