Slide 1

Slide 1 text

OVERVIEW AND 1.4 Monday, July 22, 13

Slide 2

Slide 2 text

Mark Phillips @pharkmillups [email protected] Dir., Community & Developer Evangelism

Slide 3

Slide 3 text

ROUGH AGENDA
- About Basho
- About Riak
- Riak Data Access, APIs, and Languages
- Querying
- Riak 1.4
- Selected Use Cases
- Getting Started and becoming a Riak Fanboy

Slide 4

Slide 4 text

- Founded late 2007 by a group of ex-Akamai, Mitre, Apple
- 130 Employees; >60% Dev; Distributed Company
- Sponsors of Riak, the Apache 2.0-licensed project
- Basho sells Riak Enterprise (Riak / Riak CS + Multi DC Repl)
- We generate recurring revenue and are hiring* :)

Slide 5

Slide 5 text

- Written by Basho to satisfy an internal use case
- Apache 2.0-licensed
- First OSS release August 2009; 1.0 in Sept 2011
- Mostly written in Erlang with some C/C++
- Dynamo-inspired

Slide 6

Slide 6 text

RIAK
Distributed, masterless, highly-available key/value store
- Any node serves requests
- Deployed as a cluster of nodes (>=5)
- Automatic failover
- Durable
- No SPOF
- Built-in replication (n=3)
- Dynamic data repartitioning

Slide 7

Slide 7 text

RIAK DESIGN GOALS
- High availability
- Low latency (and durable!)
- Horizontal scalability
- Fault tolerance
- Ops-friendly
- Predictability

Slide 8

Slide 8 text

Data Access / APIs / Languages

Slide 9

Slide 9 text

{“KEY” : “VALUE”}
- Values are stored against keys
- Key/Value + metadata = object
- Fundamental unit of replication
- Any data type will work. Encoded as binaries on disk.
- Soft limit of ~4MB on object size. Riak CS for larger values.

Slide 10

Slide 10 text

<bucket>/<key>
- Virtual Namespace
- Bucket + Key = object address
- Buckets have properties
- All objects in a bucket inherit its properties
- No relationships between buckets

Slide 11

Slide 11 text

INTERFACES
- HTTP API - Via a little piece of magic called Webmachine. Largely-faithful REST implementation.
- Protocol Buffers API - Thanks, Google! Compact, binary protocol.

Slide 12

Slide 12 text

CLIENT LIBS
Python, Ruby, PHP, OCaml, Java, Perl, Erlang, Node.js, C/C++, Haskell, Clojure, Scala, Go, Dart, .NET, and more.
Supported by either Basho or our community.

Slide 13

Slide 13 text

RIAK GIVES YOU [FOUR] WAYS TO STORE, RETRIEVE, AND QUERY DATA

Slide 14

Slide 14 text

CRUD
PUT /buckets/bucket/keys/key // User-defined key
POST /buckets/bucket/keys // Riak-defined key
GET /buckets/bucket/keys/key
DELETE /buckets/bucket/keys/key
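The CRUD routes on this slide can be sketched in Python with only the standard library. This is a minimal illustration, not a client library: the helper names (`object_url`, `store`, `fetch`, `delete`) are made up for the example, and the host and port are an assumption about a local node. The requests are only built here, not sent; passing one to `urllib.request.urlopen` would send it to a running node.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a local Riak node's HTTP interface; adjust host and port as needed.
BASE_URL = "http://localhost:10018"


def object_url(bucket, key=None):
    """Return the URL of a bucket's key list, or of one object if key is given."""
    url = f"{BASE_URL}/buckets/{quote(bucket, safe='')}/keys"
    if key is not None:
        url += "/" + quote(key, safe="")
    return url


def store(bucket, key, value, content_type="text/plain"):
    """Build a PUT for a user-defined key, or a POST so Riak picks the key."""
    method = "PUT" if key is not None else "POST"
    return Request(object_url(bucket, key), data=value, method=method,
                   headers={"Content-Type": content_type})


def fetch(bucket, key):
    """Build a GET for one object."""
    return Request(object_url(bucket, key), method="GET")


def delete(bucket, key):
    """Build a DELETE for one object."""
    return Request(object_url(bucket, key), method="DELETE")


if __name__ == "__main__":
    requests = (store("tweets", "t1", b"hello"), store("tweets", None, b"hi"),
                fetch("tweets", "t1"), delete("tweets", "t1"))
    for req in requests:
        print(req.get_method(), req.full_url)
```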

Slide 15

Slide 15 text

MapReduce
- Distributed processing system using Riak Pipe
- Efficient for targeted queries over a known key range
- Write jobs in Erlang

Slide 16

Slide 16 text

Riak Search
- Store and index documents (JSON, text, XML, etc.)
- Current Riak Search supports a subset of the Solr API
- Next iteration (Yokozuna; in beta) will implement distributed Solr on Riak. It will be sexy.
- Looking for beta testers

Slide 17

Slide 17 text

Secondary Indexing (2i)
riak_object: X-Riak-Index-email_bin “[email protected]”
riak_object: X-Riak-Index-value_int “42”
- Tag objects with custom metadata on PUT
- Exact match and range queries
- No multi-index queries yet
- Pagination and index-term return added in 1.4
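As a rough sketch of the tagging step, the index entries travel as `X-Riak-Index-<name>_bin` or `X-Riak-Index-<name>_int` headers on the PUT. The helper below is hypothetical (not part of any Riak client) and the email value is a placeholder; it only shows how a dict of fields could be turned into those headers.

```python
def index_headers(indexes):
    """Map {field: value} to 2i headers: ints get _int, strings get _bin."""
    headers = {}
    for name, value in indexes.items():
        if isinstance(value, bool) or not isinstance(value, (int, str)):
            raise TypeError(f"unsupported index value for {name!r}: {value!r}")
        suffix = "int" if isinstance(value, int) else "bin"
        headers[f"X-Riak-Index-{name}_{suffix}"] = str(value)
    return headers


if __name__ == "__main__":
    # Placeholder values, chosen only for illustration.
    for header, value in index_headers({"email": "user@example.com", "value": 42}).items():
        print(f"{header}: {value}")
```

The resulting dict can be merged into the headers of the PUT that stores the object.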

Slide 18

Slide 18 text

WHEN TO USE RIAK
When you have:
- critical data
- that should always be available
- and can be modeled as keys and values*
(* Hint: at scale, almost everything looks like a k/v store. Don’t be afraid to denormalize.)

Slide 19

Slide 19 text

RIAK USE CASES
- Metadata
- Users/Profiles
- Object Storage
- Sessions
- Sensor Data
- Logging Systems
- Record Systems
- Notification Systems

Slide 20

Slide 20 text

Riak 1.4

Slide 21

Slide 21 text

2i Enhancements
- Pagination and streaming results now possible
- Results now sorted by index values and then keys
- Matched index value is returned on ranges upon request

Slide 22

Slide 22 text

http://localhost:10018/buckets/tweets/index/hashtags_bin/ri/ru?max_results=5&return_terms=true
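The query URL on this slide can be assembled from its parts. The sketch below assumes the same local node as the slide; `index_range_url` is a made-up helper name, and `continuation` is the opaque token a paginated response hands back for fetching the next page.

```python
from urllib.parse import quote, urlencode

# Assumption: the same local node used on the slide.
BASE_URL = "http://localhost:10018"


def index_range_url(bucket, index, start, end,
                    max_results=None, return_terms=False, continuation=None):
    """Build a 2i range-query URL with optional pagination parameters."""
    parts = ("buckets", bucket, "index", index, start, end)
    path = "/".join(quote(str(part), safe="") for part in parts)
    params = []
    if max_results is not None:
        params.append(("max_results", max_results))
    if return_terms:
        params.append(("return_terms", "true"))
    if continuation is not None:
        params.append(("continuation", continuation))
    url = f"{BASE_URL}/{path}"
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    print(index_range_url("tweets", "hashtags_bin", "ri", "ru",
                          max_results=5, return_terms=True))
```

With `max_results=5` and `return_terms=True` this reproduces the URL shown on the slide.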

Slide 23

Slide 23 text

Riak Control Enhancements
- Riak Control is the Basho-supported, OSS GUI
- Staged clustering changes from 1.2 now in 1.4 Control
- Standalone Node Management added for single-node ops

Slide 24

Slide 24 text

Control: Staged Changes

Slide 25

Slide 25 text

Control: Committed Changes

Slide 26

Slide 26 text

Control: Cluster Stabilizing

Slide 27

Slide 27 text

Control: Standalone Node

Slide 28

Slide 28 text

Client API Enhancements
- Client-specified timeouts added
- Protocol Buffers supports all bucket props
- Streaming list-buckets
- PB interface now binds to multiple interfaces and ports

Slide 29

Slide 29 text

Data Types - Counters
- PN Counter is now available; goes up and down :)
- Accessible via newly added PB and HTTP endpoints
- A type of CRDT (first of many)
- Good for like buttons, upvotes, etc.
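To make the "goes up and down" point concrete, here is a textbook PN-Counter sketched in Python. It is not Riak's implementation; the class and actor names are illustrative. Each replica tracks its own increments (P) and decrements (N), and replicas converge by taking the per-actor maximum on merge.

```python
class PNCounter:
    """Textbook PN-Counter: per-actor increment and decrement tallies."""

    def __init__(self, actor):
        self.actor = actor
        self.p = {}  # actor -> total increments seen from that actor
        self.n = {}  # actor -> total decrements seen from that actor

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.p[self.actor] = self.p.get(self.actor, 0) + amount

    def decrement(self, amount=1):
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.n[self.actor] = self.n.get(self.actor, 0) + amount

    def value(self):
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other):
        """Fold in another replica's state; safe to repeat or reorder."""
        for mine, theirs in ((self.p, other.p), (self.n, other.n)):
            for actor, count in theirs.items():
                mine[actor] = max(mine.get(actor, 0), count)


if __name__ == "__main__":
    a, b = PNCounter("node_a"), PNCounter("node_b")
    a.increment(3)   # three upvotes land on one replica
    b.increment(2)   # two upvotes land on another
    b.decrement(1)   # one vote is withdrawn there
    a.merge(b)
    b.merge(a)
    print(a.value(), b.value())
```

Because merge takes a maximum rather than adding, replaying the same merge does not double-count, which is what lets replicas exchange state in any order.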

Slide 30

Slide 30 text


Slide 31

Slide 31 text

Object Storage Compactness
- New binary format
- Reduces storage overhead (especially for small objects)
- Default on new 1.4 installs; must be enabled explicitly when upgrading

Slide 32

Slide 32 text

Additional 1.4 Hotness
- Improvements to ‘riak-admin transfers’
- Lager upgraded to 2.0
- ‘riak attach’ modified to use ‘-remsh’
- More than 170 bugs and issues resolved

Slide 33

Slide 33 text

Selected Use Cases

Slide 34

Slide 34 text

IN PRODUCTION AT And 1000s more

Slide 35

Slide 35 text

VOXER
- Super-useful communication platform for people and businesses
- Using Riak for all operational storage and serving of data

Slide 36

Slide 36 text

INITIAL STATS
- 11 Riak Nodes
- ~500GB dataset
- ~20k peak concurrent users
- ~4MM daily requests

Slide 37

Slide 37 text


Slide 38

Slide 38 text

GROWTH STATS*
- 100 Nodes
- ~1TB Data incoming/day
- 400k concurrent users
- 2 billion requests/day
- Grew from 11 to 80 nodes in ~30 days

Slide 39

Slide 39 text

- Moved from Cassandra to Riak
- 100s Nodes, several clusters
- Add storage, serving
- Impression counting
- Custom C backend
- Billions of requests/day
- Using Riak Enterprise for MDC
- https://vimeo.com/53480727

Slide 40

Slide 40 text

Multi-tenant cloud storage software for public and private clouds. Designed to provide simple, available, distributed cloud storage at any scale.
S3-API compatible and supports per-tenant reporting for billing and metering use cases. Additional APIs on the way.
Stores files of arbitrary size. Under the hood, stores 1MB chunks alongside a manifest. Stateless proxy (CS) does chunking. Riak does distribution, storage, etc.
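The chunk-plus-manifest idea can be sketched as follows. This is an illustration only: the manifest fields, the chunk key scheme, and the use of SHA-256 are assumptions made for the example and do not reflect the actual Riak CS formats.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1MB chunks, as described on the slide


def chunk_object(name, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and describe them in a manifest."""
    chunks = {}
    manifest = {"name": name, "size": len(data), "chunks": []}
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        block = data[offset:offset + chunk_size]
        key = f"{name}:{seq}"  # hypothetical chunk key scheme
        chunks[key] = block
        manifest["chunks"].append(
            {"key": key, "sha256": hashlib.sha256(block).hexdigest()})
    return manifest, chunks


def reassemble(manifest, chunks):
    """Rebuild the object in manifest order, verifying each chunk's digest."""
    parts = []
    for entry in manifest["chunks"]:
        block = chunks[entry["key"]]
        if hashlib.sha256(block).hexdigest() != entry["sha256"]:
            raise ValueError(f"corrupt chunk {entry['key']}")
        parts.append(block)
    return b"".join(parts)


if __name__ == "__main__":
    payload = bytes(range(256)) * 10000  # about 2.5MB of sample data
    manifest, chunks = chunk_object("video.bin", payload)
    print(len(manifest["chunks"]), reassemble(manifest, chunks) == payload)
```

In this picture each chunk and the manifest would be stored as ordinary key/value objects, which keeps every stored value small.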

Slide 41

Slide 41 text

Extends Riak's capabilities with:
- multi-datacenter replication
- SNMP Configuration
- JMX-Monitoring
- 24x7 support from Basho Engineers
One cluster acts as a "source cluster". The source cluster replicates its data to one or more "sink clusters" using either real-time or full sync. Data transfer is unidirectional (source -> sink). Bidirectional synchronization can be achieved by configuring a pair of connections between clusters.

Slide 42

Slide 42 text

MULTI DC REPL
- Cluster-to-cluster replication over N data centers
- “Source” cluster talks to one or more “sink” clusters
- Hot failover between source and sink
- Two forms of replication
- Full sync - periodic exchanges of deltas (via merkle)
- Real time - bi directional repl between more than one
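The idea behind exchanging deltas via hash trees can be illustrated with a deliberately small, two-level tree: keys are hashed into segments, each segment gets a digest, and the segment digests roll up into a root. Only segments whose digests differ need a key-by-key comparison. This is a simplified sketch and not Riak's replication code; the function names and segment count are assumptions.

```python
import hashlib


def _digest(data):
    return hashlib.sha256(data).hexdigest()


def build_tree(store, segments=8):
    """Return (root, segment_digests, segment_contents) for a key/value dict."""
    leaves = [{} for _ in range(segments)]
    for key, value in store.items():
        idx = int(_digest(key.encode()), 16) % segments
        leaves[idx][key] = _digest(value)
    seg_digests = [_digest(repr(sorted(leaf.items())).encode()) for leaf in leaves]
    root = _digest("".join(seg_digests).encode())
    return root, seg_digests, leaves


def keys_to_sync(source, sink, segments=8):
    """Keys the source must send so the sink holds the source's values."""
    s_root, s_digests, s_leaves = build_tree(source, segments)
    k_root, k_digests, k_leaves = build_tree(sink, segments)
    if s_root == k_root:
        return set()  # roots match, nothing to exchange
    stale = set()
    for i in range(segments):
        if s_digests[i] == k_digests[i]:
            continue  # whole segment matches, skip its keys
        for key, digest in s_leaves[i].items():
            if k_leaves[i].get(key) != digest:
                stale.add(key)
    return stale


if __name__ == "__main__":
    source = {"a": b"1", "b": b"2", "c": b"3"}
    sink = {"a": b"1", "b": b"old"}
    print(sorted(keys_to_sync(source, sink)))
```

The comparison runs from source to sink only, matching the unidirectional model; a pair of such exchanges would be needed for bidirectional sync.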

Slide 43

Slide 43 text

$$, Community, RICON and Getting Started

Slide 44

Slide 44 text

RIAK COMMUNITY
- Mailing List - 1500 developers
- IRC - 200+ people every day yelling about software
- GitHub - 1000s of watchers; 300+ contributors to all projects
- Meetups - 10 Countries, 23 Cities, 3700+ Members
- Deployments - 1000s in production.

Slide 45

Slide 45 text

GETTING STARTED
- Docs - docs.basho.com
- Downloads - http://docs.basho.com/riak/latest/downloads/
- Riak Source Code - github.com/basho/riak
- All Basho Source Code - github.com/basho/
- Riak Mailing List - http://bit.ly/FjChC
- Email me - [email protected]

Slide 46

Slide 46 text

October 29-30 in San Francisco
Talks, hacking, parties
Dedicated to the future of Riak and distributed systems in production
ricon.io/west.html
REGISTER NOW! http://ricon-west-2013.eventbrite.com/

Slide 47

Slide 47 text

QUESTIONS Mark Phillips [email protected] Dir., Community & Developer Evangelism @pharkmillups