Slide 1

Slide 1 text

Riak & Riak CS: Enterprise-Grade NoSQL Distributed Data Store and Cloud Storage Solutions

Slide 2

Slide 2 text

AGENDA •  Riak Overview •  Riak Architecture •  Use Cases •  Riak EDS •  Riak CS •  Q & A

Slide 3

Slide 3 text

About Basho: Our Mission is to Be The Leader in Distributed Systems •  Founded January 2008 •  115+ employees •  Headquartered in Cambridge, with regional offices in San Francisco, Washington DC, London and Tokyo •  Makers of Riak, a popular distributed key-value store •  Thousands of users worldwide, including over 20% of the Fortune 50 •  30,000+ downloads per month, up from 19,500 in Dec 2011 •  Strategic partners include Citrix, IDC Frontier, Yahoo! Japan, and Microsoft

Slide 4

Slide 4 text

Riak
Say “NO” to: •  Master-Slave architecture •  Application sharding •  Active-Passive
Say “Yes” to: •  Distributed model •  Active-Active with write scalability

Slide 5

Slide 5 text

Riak Riak is a distributed NoSQL key-value store. Simple operations •  Get •  Put •  Delete

Slide 6

Slide 6 text

Key-Value Data Model (Object/Key Operations) •  Keys are grouped into buckets •  All data (objects) are referenced by keys •  An object is composed of metadata and a value [Diagram: a bucket containing KEY/VALUE pairs]
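The data model on this slide can be sketched in a few lines of Python. This is only an in-memory illustration: the names `TinyKV` and `StoredObject` are hypothetical and are not part of any Riak client library.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class StoredObject:
    """An object is a value plus metadata (for example a content type)."""
    value: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


class TinyKV:
    """Hypothetical in-memory stand-in for a bucket/key store (not the Riak API)."""

    def __init__(self) -> None:
        # Every object is addressed by its (bucket, key) pair.
        self._data: Dict[Tuple[str, str], StoredObject] = {}

    def put(self, bucket: str, key: str, value: bytes, **metadata: str) -> None:
        self._data[(bucket, key)] = StoredObject(value, dict(metadata))

    def get(self, bucket: str, key: str) -> Optional[StoredObject]:
        return self._data.get((bucket, key))

    def delete(self, bucket: str, key: str) -> bool:
        # Returns True if something was actually removed.
        return self._data.pop((bucket, key), None) is not None


store = TinyKV()
store.put("product", "iphone", b'{"price": 199}', content_type="application/json")
obj = store.get("product", "iphone")
```

The three methods mirror the Get, Put and Delete operations; a real cluster would route each call to the replicas responsible for the key.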

Slide 7

Slide 7 text

Masterless & Highly Available •  Any node can serve client requests •  Fallbacks are used when nodes are down •  Always accepts read and write requests •  Per-request quorums

Slide 8

Slide 8 text

Consistent Hashing & The Ring •  160-bit integer keyspace •  Divided into a fixed number of evenly-sized partitions •  Partitions are claimed by nodes in the cluster •  Replicas go to the N partitions following the key [Diagram: ring of 32 partitions claimed by node 0 to node 3, with N=3; hash(“product/iphone”) is placed on the ring, whose positions are marked 0, 2^160/4 and 2^160/2]
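A minimal sketch of the ring lookup described above, using the slide's numbers (32 partitions, 4 nodes, N=3). The way the bucket and key are encoded before hashing and the round-robin claim are simplifying assumptions made for illustration; they are not Riak's actual encoding or claim algorithm.

```python
import hashlib
from typing import List, Tuple

RING_SIZE = 2 ** 160            # SHA-1 yields 160-bit integers
NUM_PARTITIONS = 32             # fixed number of evenly-sized partitions
PARTITION_SIZE = RING_SIZE // NUM_PARTITIONS
NODES = ["node0", "node1", "node2", "node3"]
N_VAL = 3


def key_hash(bucket: str, key: str) -> int:
    """Map a bucket/key pair onto the 160-bit ring (illustrative encoding)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_owner(partition: int) -> str:
    """Assumed claim: partitions are dealt round-robin to the nodes."""
    return NODES[partition % len(NODES)]


def preference_list(bucket: str, key: str, n_val: int = N_VAL) -> List[Tuple[int, str]]:
    """Return the n_val partitions following the key, with their owning nodes."""
    first = (key_hash(bucket, key) // PARTITION_SIZE + 1) % NUM_PARTITIONS
    partitions = [(first + i) % NUM_PARTITIONS for i in range(n_val)]
    return [(p, partition_owner(p)) for p in partitions]


plist = preference_list("product", "iphone")
```

Because consecutive partitions belong to different nodes under this claim, the three replicas of `product/iphone` land on three distinct machines.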

Slide 9

Slide 9 text

Failure Scenario •  Node fails •  Requests go to fallback nodes [Diagram: ring with hash(“product/iphone”) and node 0 to node 3; the failed node's partitions are marked X]

Slide 10

Slide 10 text

Hinted Handoff •  Node comes back •  “Handoff”: data returns to the recovered node •  Normal operations resume [Diagram: ring with hash(“product/iphone”) and node 0 to node 3]
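The failure and handoff sequence from the last two slides can be modeled roughly as follows. This is a toy model under stated assumptions (a fixed four-node cluster, one fallback per failed primary, a hypothetical `Cluster` class); it is not how Riak implements fallbacks internally.

```python
from typing import Dict, List, Set, Tuple

NODES = ["node0", "node1", "node2", "node3"]


class Cluster:
    """Toy model of fallbacks and hinted handoff (not Riak's implementation)."""

    def __init__(self) -> None:
        self.down: Set[str] = set()
        # data[node][key] = value, for replicas stored on their proper owner
        self.data: Dict[str, Dict[str, str]] = {n: {} for n in NODES}
        # hints[fallback] = list of (intended_owner, key, value)
        self.hints: Dict[str, List[Tuple[str, str, str]]] = {n: [] for n in NODES}

    def write(self, primaries: List[str], key: str, value: str) -> None:
        fallbacks = [n for n in NODES if n not in primaries and n not in self.down]
        for owner in primaries:
            if owner not in self.down:
                self.data[owner][key] = value
            elif fallbacks:
                # A live fallback accepts the write on the failed owner's behalf.
                self.hints[fallbacks.pop(0)].append((owner, key, value))

    def recover(self, node: str) -> None:
        """Node comes back: fallbacks hand the hinted data over to it."""
        self.down.discard(node)
        for fallback in NODES:
            kept = []
            for owner, key, value in self.hints[fallback]:
                if owner == node:
                    self.data[node][key] = value
                else:
                    kept.append((owner, key, value))
            self.hints[fallback] = kept


cluster = Cluster()
cluster.down.add("node1")
cluster.write(["node0", "node1", "node2"], "product/iphone", "v1")
hinted_on_node3 = list(cluster.hints["node3"])   # node3 holds node1's replica for now
cluster.recover("node1")
```

While `node1` is down the write still reaches three nodes; after `recover` the hint on `node3` is gone and `node1` holds its replica again.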

Slide 11

Slide 11 text

Riak’s core capability Scalable Add commodity hardware to get more [throughput | processing | storage]

Slide 12

Slide 12 text

Riak’s core capability: Fault Tolerant •  All nodes participate equally (no SPOF) •  All data is replicated (n=3 by default) •  Cluster transparently survives node failure & network partition

Slide 13

Slide 13 text

Tunable Consistency •  n_val: number of replicas to store; bucket-level setting. Defaults to “3”. •  w: number of replica acks required for a successful write. Defaults to “2”. •  r: number of replica acks required for a successful read; request-level setting. Defaults to “2”. •  pr, pw & dw: primary-read, primary-write and durable-write quorums •  Tweak consistency vs. availability
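The trade-off behind these settings can be illustrated with a small sketch. The function names are hypothetical, and the overlap rule shown applies to strict quorums only; it ignores fallback nodes, which can also acknowledge requests.

```python
def quorum_met(acks: int, required: int) -> bool:
    """A request succeeds once at least `required` replicas have replied."""
    return acks >= required


def replica_sets_overlap(n_val: int, r: int, w: int) -> bool:
    """With strict quorums, every read set intersects every write set if r + w > n_val."""
    return r + w > n_val


N_VAL, R, W = 3, 2, 2   # defaults quoted on the slide

write_ok = quorum_met(acks=2, required=W)            # two of three replicas answered
write_fails = not quorum_met(acks=1, required=W)     # only one replica answered
overlap_default = replica_sets_overlap(N_VAL, R, W)
overlap_fast = replica_sets_overlap(N_VAL, r=1, w=1)  # lower latency, weaker guarantee
```

Lowering r and w lets a request return after fewer acknowledgements, at the cost of a read possibly missing the latest write; raising them does the opposite.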

Slide 14

Slide 14 text

Two APIs •  HTTP (just like the web) •  Protocol Buffers (thank you, Google) Client Libraries: Ruby, Node.js, Java, Python, Perl, Erlang, PHP, C, Scala, Haskell, Lisp, .NET, Play, and more (supported by either Basho or the community).

Slide 15

Slide 15 text

Riak Backend • Riak has a pluggable backend architecture • Bitcask, LevelDB are used the most in production depending on use-case • All writes are appends to a file • This provides crash safety and fast writes
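A rough sketch of the append-only idea, in the spirit of a Bitcask-style store: every write is appended to a log file and an in-memory index remembers where the latest value of each key lives. The class name, record layout and single-file design are assumptions made for illustration, not the actual on-disk format of any Riak backend.

```python
import os
import struct
import tempfile
from typing import Dict, Optional, Tuple

HEADER = struct.Struct(">II")   # key length, value length (assumed record layout)


class AppendOnlyStore:
    """Hypothetical append-only key/value log with an in-memory index."""

    def __init__(self, path: str) -> None:
        self._path = path
        # key -> (offset of the value in the file, length of the value)
        self._index: Dict[bytes, Tuple[int, int]] = {}
        open(path, "ab").close()   # create the log file if it is missing

    def put(self, key: bytes, value: bytes) -> None:
        with open(self._path, "ab") as f:
            f.seek(0, os.SEEK_END)
            start = f.tell()
            # Existing bytes are never rewritten; new records go at the end.
            f.write(HEADER.pack(len(key), len(value)) + key + value)
        self._index[key] = (start + HEADER.size + len(key), len(value))

    def get(self, key: bytes) -> Optional[bytes]:
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        with open(self._path, "rb") as f:
            f.seek(offset)
            return f.read(length)


path = os.path.join(tempfile.mkdtemp(), "data.log")
store = AppendOnlyStore(path)
store.put(b"k", b"v1")
store.put(b"k", b"v2")   # the old record stays in the file; the index moves on
latest = store.get(b"k")
log_size = os.path.getsize(path)
```

Since a crash can at worst leave a partial record at the tail of the file, earlier records stay intact, which is the crash-safety property the slide refers to. Reclaiming the space held by stale records is left out of this sketch.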

Slide 16

Slide 16 text

Accessing Data in Riak
Object/Key Operations: Retrieving Single Objects
•  Support for retrieving the object associated with a particular bucket/key
•  Support for retrieving all of the keys associated with a particular bucket
Riak Search: Collecting, Parsing, and Storing Data
•  Distributed, full-text search engine with an easy-to-use query language, a Solr-like HTTP interface and an Apache Lucene-style query syntax
•  Support for a wide variety of MIME types, including JSON, plain text, XML and Erlang
•  Ideal for indexing JSON documents, as indexes are built automatically from a schema
Secondary Indexes (Riak 2i): Seeking Reverse Lookups on Data Stored
•  Provides the ability, at write time, to tag an object stored in Riak with one or more values (key/value metadata), which can then be queried
•  Useful for finding data based on terms other than an object’s bucket/key pair, or for adding metadata values to a binary object or opaque blob
MapReduce: Processing a Large Dataset
•  Provides the general ability to analyze and aggregate data in phases with data locality
•  Supports JavaScript, with Erlang available for better performance
Riak Search and 2i query results can be used as an input to MapReduce
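To make the secondary-index idea concrete, here is a small in-memory sketch in which objects are tagged at write time and the query result is then fed into a map phase and a reduce phase. `IndexedBucket` and its methods are hypothetical names, not the Riak client API; the `_bin` suffix follows the 2i naming convention for binary (string) indexes.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


class IndexedBucket:
    """Hypothetical in-memory bucket with write-time index tags (not the Riak API)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.tags: Dict[str, Dict[str, str]] = {}
        self.index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def put(self, key: str, value: bytes, indexes: Optional[Dict[str, str]] = None) -> None:
        # Drop index entries left over from a previous version of this key.
        for name, term in self.tags.get(key, {}).items():
            self.index[(name, term)].discard(key)
        self.objects[key] = value
        self.tags[key] = dict(indexes or {})
        for name, term in self.tags[key].items():
            self.index[(name, term)].add(key)

    def query(self, name: str, term: str) -> List[str]:
        """Exact-match lookup: which keys carry this index value?"""
        return sorted(self.index.get((name, term), set()))


photos = IndexedBucket()
photos.put("img1", b"aaaa", {"owner_bin": "alice"})
photos.put("img2", b"bb", {"owner_bin": "bob"})
photos.put("img3", b"cccccc", {"owner_bin": "alice"})

alice_keys = photos.query("owner_bin", "alice")
# The index result feeds a map phase (size per object) and a reduce phase (sum).
mapped = [len(photos.objects[k]) for k in alice_keys]
total_bytes = sum(mapped)
```

The values here are opaque blobs, yet they can still be found by owner because the lookup goes through the tags rather than the content.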

Slide 17

Slide 17 text

Riak Technical Architecture
Always-Available •  Scale-Out •  Fast Ultra-Low Latency •  Any data types •  Programmer ease-of-use
•  Riak Clients (Ruby, Java, Node.js, PHP, .NET, etc.)
•  Webmachine (HTTP) •  Protocol Buffers (Binary, Persistent Connect)
•  MDC Replication (Riak EnterpriseDS only) •  Riak SNMP / JMX (Riak EnterpriseDS only)
•  Riak KVS (with 2i, MapReduce) •  Riak Search (Distributed, Full-Text) •  Riak Pipe
•  Riak Core (Ring Management, Partitioning, Anti-Entropy, Replication, Version Control)
•  Pluggable Storage Backends: Bitcask, LevelDB
•  Riak Control

Slide 18

Slide 18 text

What Is Riak Good For?

Slide 19

Slide 19 text

What Riak Isn’t •  NOT Relational: no fixed schema •  NOT Right for Every Project: large objects (Riak CS is a good fit here), dynamic queries (SQL)

Slide 20

Slide 20 text

Modeling Applications in Riak

Slide 21

Slide 21 text

Ideal Riak Scenarios
When to Use: •  When you have enough data to require >1 physical machine (preferably >4) •  When availability is the top requirement •  When your data can be modeled as keys and values
Popular Use Cases: •  Ad Networks •  Digital Media •  On-Line Games •  Social Networks •  Social Analysis •  Cloud Operators •  Messaging Services •  Product Catalogs •  Document Management •  Health Care Information Management

Slide 22

Slide 22 text

Riak Production Users: growing & growing •  Mobile, Retail & Social •  Cloud Computing & Advertising •  Security and Others •  Gaming, Payments and Others

Slide 23

Slide 23 text

Web / Mobile App Growth: Case Study for Top Rated Apple App Store App •  #4 most popular Apple App Store Social Networking App at EOY, behind Facebook, Skype and Twitter •  Truly Viral Growth: Scaled 10x between Thanksgiving and New Year’s Day •  Required scaling across multiple IaaS / hosting providers •  Surpassed one billion operations per day

Slide 24

Slide 24 text

Mobile-to-Mobile Content Store Bump – Low Latency and Always Available •  800 million pieces of structural data in Riak, including Photos, Chats, and Contact Cards. •  10 million active users •  77 million downloads to date •  Switched to Riak in August 2011 •  #7 Most Downloaded iPhone App

Slide 25

Slide 25 text

enStratus •  A cloud infrastructure management solution for deploying and managing enterprise-class applications •  Moved from MySQL to Riak. Reasons: •  Write scalability •  Resilience to failure across multiple datacenters •  Stores machine and state information, and data supporting analytics and audit control •  George Reese gave an excellent talk at RICON last year: http://vimeo.com/54887751 “As I’ve looked at a number of problem domains from customers and our own systems, you see this pattern where a relational database has been used just because it’s the default… and the reality is that more of the world is eventually consistent than not,” said George Reese, CTO of enStratus

Slide 26

Slide 26 text

Riak MDC (EDS): Multi-Data Center Replication [Diagram: (1) Applications, Users and Machines (Cloud, Mobile, Social) Generate Data; (2) Riak Stores and Manages Data Efficiently and Effectively across Data Center #1, Data Center #2 and Data Center #3] •  Clusters are local to regional users to solve latency •  Replication is uni-directional; remote clusters can be set up to replicate data back to a primary cluster, thus synchronizing bi-directionally •  Easily deploy in many regional zones •  Write-everywhere solution •  Easy to scale: additional data centers can easily be added

Slide 27

Slide 27 text

Full-sync Replication

Slide 28

Slide 28 text

Real-time Replication

Slide 29

Slide 29 text

Product Information Repository

Slide 30

Slide 30 text

Multi-Device Session Store: Case Study Showcases Seamless User Experience. The Global Session Store manages a seamless user session throughout a customer’s multi-mode experience, from Web to device. [Diagram: data centers in Philadelphia and Denver]

Slide 31

Slide 31 text

Backups •  Bitcask and LevelDB are both log-structured stores; cp, rsync, tar and custom backup tools will work •  FS-level snapshots of the data directory can be taken while the node is running •  Backups aren't yet perfected; future releases will have more efficient, specialized backup methods for each backend

Slide 32

Slide 32 text

Stats and Monitoring (1) • Riak exposes data about current operating status (counters, histograms, etc.) via the HTTP /stats endpoint or ‘riak-admin status’ • Anything that speaks HTTP can be plugged into Riak • Plugins exist for most OSS monitoring tools (munin, cacti, nagios, graphite, statsd)

Slide 33

Slide 33 text

Stats and Monitoring (2) •  ‘riaknostic’ is a suite of diagnostic checks that can be used to debug your cluster before it’s in production; it checks for common misconfigurations •  Riak Control is a full-fledged management GUI that Basho develops and maintains.

Slide 34

Slide 34 text

Riak 1.3 – GA Now •  Active Anti Entropy •  Replication enhancements for MDC •  IPv6 support •  New Look for Riak Control

Slide 35

Slide 35 text

What is Riak CS? Key features: •  Multi-tenant support •  User authentication and authorization •  Amazon S3 API compatibility •  Per-tenant visibility •  Provisioning, metering, billing and reporting •  Multipart upload for objects up to 5TB

Slide 36

Slide 36 text

Riak CS in Action

Slide 37

Slide 37 text

Riak CS Use Cases •  Storage for Cloud Computing •  S3 Without AWS •  Cloud Drive (General Content Storage) •  Backup-as-a-Service •  Archival and Preservation •  Integration with Workflow [Diagram labels: Reporting, Large Objects, AuthZ, Multi-Tenancy]

Slide 38

Slide 38 text

Multi-Datacenter Replication •  Multi-site storage replication •  Data locality •  Availability in disaster scenarios •  Active backup

Slide 39

Slide 39 text

Multi-Datacenter Replication: How It Works •  Global information for users, buckets and manifests is streamed in real time •  Objects are replicated in full-sync or real-time sync mode •  If a client requests an object from a site but not all of the blocks that constitute that object have been replicated to that site, missing blocks will be requested and streamed from the “origin” cluster

Slide 40

Slide 40 text

Riak CS Roadmap •  Swift API •  Keystone Integration •  S3 Features •  COPY Object •  Object Versioning •  Cloud Stack Integration

Slide 41

Slide 41 text

Riak 1.4 (Roadmap) •  Dynamic Ring Sizing •  2i Pagination •  Performance and Scaling Improvements

Slide 42

Slide 42 text

Riak EDS or CS? •  Does data unavailability cost thousands of $/minute? Riak EDS (Enterprise Data Store) •  Do you want to build a cloud storage service for your business? Riak CS (Cloud Storage)

Slide 43

Slide 43 text

Basho’s Product Family: Distributed Data Technology is Our Passion
Riak (Open Source Distributed Database) •  Always-available, scalable, low-cost NoSQL database •  Over 35,000 downloads per month •  Thousands of users worldwide •  Available since Sept 2009 •  Version 1.0 unveiled September 2011 •  Subsequent versions released alongside Riak EDS
Riak EnterpriseDS (Commercial Distributed Database) •  Adds multi-data center replication, monitoring & 24x7 support •  Requires commercial contract and secure download •  Version 1.1 launched with Riak Control in Feb 2012 •  Version 1.2 launched in August 2012 •  Version 1.3 launched in February 2013
Riak CS (Distributed Cloud Storage Platform) •  Expands with multi-tenancy, large object support, metering and Amazon S3 API •  Launched on March 27, 2012 •  Used by multiple global cloud operators •  Software released to open source on March 20th

Slide 44

Slide 44 text

Resources Basho Docs http://docs.basho.com/ Riak Fast Track http://docs.basho.com/riak/1.1.4/tutorials/fast-track/ http://docs.basho.com/riakcs/latest/riakcs-tutorials/fast-track/ Basho Blog http://basho.com/blog/technical/

Slide 45

Slide 45 text

Sign up here- http://ricon.io/west.html
