
Overview of Riak CS

Tom Santero
December 03, 2012

Slides from my talk discussing Basho's Riak CS at the CloudDC Meetup: http://www.meetup.com/CloudDC/events/90432962/

Transcript

  1. Basho
    • Founded in 2008 by ex-Akamai execs and engineers
    • Sponsors of Riak, an open source (Apache 2) distributed database
    • 100+ employees, 75% engineers
    • Offices in Herndon, San Francisco, Cambridge, London and Tokyo
    • We sell Riak EDS (Open Source + Multi-Datacenter replication) plus support, training, services ... and Riak CS!
    Wednesday, December 5, 12
  2. On March 27, 2012 Basho announced a new product called Riak CS
  3. Riak CS is... an enterprise cloud storage offering built on top of Riak
    • multi-tenancy
    • per-user metering
    • multi-region HA
    • large object storage
    • S3 API
  4. Riak: distributed, masterless, highly available key value store
    PROS:
    • high read/write availability
    • predictable latency
    • minimal maintenance required
    CONS:
    • I/O bound
    • network is very chatty
    • permissive API
  5. STORE ANY TYPE OF DATA YOU LIKE
    01101011 01101001 01110100 01110100 01100101 01101110 01110011 00100000 01110000 01110101 01110000 01110000 01101001 01100101 01110011 00100000 01101000 01100001 01101101 01110011 01110100 01100101 01110010 01110011 00100000 01110011 01110101 01100111 01100001 01110010 00100000 01100111 01101100 01101001 01100100 01100101 01110010 01110011 00100000 00100001
  6. STORE ANY TYPE OF DATA YOU LIKE and read it back the same way
  7. BASIC CONCEPTS: USERS
    • multi-tenancy: Riak CS will track individual usage/stats
    • users identified by access_key
    • users authenticated by secret_key
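The access_key / secret_key pair on slide 7 can be illustrated with a request-signing sketch. This assumes the S3 signature version 2 scheme (an HMAC-SHA1 over a canonical string, sent in an `Authorization: AWS key:signature` header); the slides do not show the signing step, so treat the details as an assumption. The function name `sign_request` and the secret below are hypothetical, and canonicalized `x-amz-*` headers are left out for brevity.

```python
import base64
import hashlib
import hmac


def sign_request(access_key, secret_key, verb, resource, date,
                 content_md5="", content_type=""):
    """Build an S3 v2-style Authorization header value (simplified sketch)."""
    # Canonical string: verb, MD5, content type, date and resource, newline-joined.
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    # The access key identifies the user; the signature proves knowledge of the secret.
    return f"AWS {access_key}:{signature}"


header = sign_request("MT_WDIUW64WFQLUP6IOO", "not-a-real-secret",
                      "GET", "/some_bucket/", "Wed, 05 Dec 2012 00:00:00 GMT")
```

The secret never travels with the request: the server recomputes the signature from its own copy of the secret and compares.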
  8. BASIC CONCEPTS: BUCKETS
    • users create buckets
    • buckets are like folders
    • store objects in buckets
    • names are globally unique
  9. BASIC CONCEPTS: OBJECTS
    • stored in buckets
    • objects are opaque
    • store any file type
    • file sizes up to 5GB*
  10. BASIC CONCEPTS: ACLs (Access Control Lists)
    • permissions on buckets
    • permissions on objects
    • permissions on permissions
  11. MDC: GLOBAL LOW-LATENCY OBJECT STORAGE
    FOR PROVIDERS / FOR ENTERPRISE: global availability; define global availability regions; DC redundancy; DC redundancy
    “[deploying Riak CS] reduces the risk of using AWS and allows customers to store their data in their own data centers, on their own terms.” - Alex Williams, TechCrunch
  12. Coming Soon: CLOUDSTACK INTEGRATION
    Patch to enable Riak CS as secondary storage: https://reviews.apache.org/r/8123/
  13. [Diagram: a large object; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
  14. [Diagram: a large object; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
    1. user uploads an object
  15. [Diagram: a large object; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
  16. [Diagram: a large object split into 1 MB chunks; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
    2. Riak CS breaks object into 1 MB chunks
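The chunking step on slide 16 can be sketched as a plain split of the uploaded bytes into fixed-size blocks. This is only an illustration of the idea: the helper name `split_into_chunks` is hypothetical, and how Riak CS names or tracks the blocks internally is not covered by the slides.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MB, the block size named on the slide


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split an object into consecutive blocks of at most chunk_size bytes."""
    return [data[offset:offset + chunk_size]
            for offset in range(0, len(data), chunk_size)]


# A payload of 2 MB plus 5 bytes yields two full blocks and one short tail block.
payload = bytes(2 * CHUNK_SIZE + 5)
chunks = split_into_chunks(payload)
```

Concatenating the blocks in order gives back the original object, which is what lets a reader stream them back "the same way".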
  17. [Diagram: a large object split into 1 MB chunks; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
  18. [Diagram: 1 MB chunks; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
    3. Riak CS streams chunks to Riak nodes
  19. [Diagram: a large object; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
  20. [Diagram: a large object; five Riak CS instances, each with S3 API and Reporting API; five Riak nodes]
    4. Riak replicates and stores chunks
  21. REPLICATION and ADMINISTRATION
    • consistent hashing
    • replicas
    • virtual nodes (vnodes)
    • handoff procedures
    • gossip protocols
    • anti-entropy
  22. Riak Object: {"key": "value"}
    • values stored against keys
    • key/value + metadata = object
    • fundamental unit of replication
  23. Buckets: <<bucket>>/<<key>>
    • virtual namespace
    • bucket + key = object address
    • buckets have properties
    • all objects inherit bucket props
  24. Quorum requests: N, R, W, PR/PW, DW (values shown: 3, 2, 2)
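Reading the values on slide 24 as N = 3, R = 2, W = 2 (an interpretation of the flattened slide), the usual quorum argument is that any read set and any write set share at least one replica whenever R + W > N. The sketch below checks that by brute force under a strict-quorum idealisation; it does not model fallback nodes, and `quorums_overlap` is a hypothetical helper name.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every set of r replicas intersects every set of w replicas."""
    replicas = range(n)
    return all(set(readers) & set(writers)
               for readers in combinations(replicas, r)
               for writers in combinations(replicas, w))


# Brute force agrees with the R + W > N rule for the slide's values.
overlap_322 = quorums_overlap(3, 2, 2)
overlap_311 = quorums_overlap(3, 1, 1)
```

With R = W = 1 a read can land entirely on replicas the latest write has not reached yet, which is the trade-off the tunable values expose.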
  25. The Ring
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    [Diagram: ring of 32 partitions with positions 0, 2^160/4 and 2^160/2 marked]
  26. The Ring
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    • partitions are claimed by nodes in the cluster
    [Diagram: ring of 32 partitions claimed by node 0, node 1, node 2 and node 3, with positions 0, 2^160/4 and 2^160/2 marked]
  27. The Ring
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    • partitions are claimed by nodes in the cluster
    • replicas go to the N partitions following the key
    [Diagram: ring partitions claimed by node 0, node 1, node 2 and node 3]
  28. The Ring
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    • partitions are claimed by nodes in the cluster
    • replicas go to the N partitions following the key
    [Diagram: hash("meetups/CloudDC") placed on the ring with N=3; partitions claimed by node 0, node 1, node 2 and node 3]
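The ring slides can be turned into a small sketch: hash the key into the 160-bit keyspace, find the partition it falls in, and take N consecutive partitions as the replica targets. Several things here are assumptions for illustration only: SHA-1 applied to the literal string "meetups/CloudDC", round-robin partition claiming, and starting the list at the partition the hash lands in (whether that one or its successor counts as first is glossed over). The names `ring_position` and `preference_list` are hypothetical.

```python
import hashlib

RING_SIZE = 2 ** 160                   # 160-bit integer keyspace
NUM_PARTITIONS = 32                    # partition count shown on the slide
PARTITION_WIDTH = RING_SIZE // NUM_PARTITIONS
NODES = ["node 0", "node 1", "node 2", "node 3"]


def ring_position(bucket_key: str) -> int:
    """Map a bucket/key string to an integer in the 160-bit keyspace."""
    digest = hashlib.sha1(bucket_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def preference_list(bucket_key: str, n: int = 3) -> list:
    """Return n consecutive (partition, owner) pairs, walking around the ring."""
    first = ring_position(bucket_key) // PARTITION_WIDTH
    partitions = [(first + i) % NUM_PARTITIONS for i in range(n)]
    # Assumed claim: partitions are dealt out to nodes round-robin.
    return [(p, NODES[p % len(NODES)]) for p in partitions]


replicas = preference_list("meetups/CloudDC", n=3)
```

Because neighbouring partitions belong to different nodes under round-robin claiming, the three replicas end up on three distinct nodes.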
  29. Scaling
    $ riak start
    $ riak-admin cluster join <node>
    $ riak-admin cluster plan
    $ riak-admin cluster commit
  30. $ riak-admin cluster plan
    =============================== Staged Changes ================================
    Action         Nodes(s)
    -------------------------------------------------------------------------------
    join           '[email protected]'
    join           '[email protected]'
    join           '[email protected]'
    -------------------------------------------------------------------------------

    NOTE: Applying these changes will result in 1 cluster transition

    ###############################################################################
                             After cluster transition 1/1
    ###############################################################################

    ================================= Membership ==================================
    Status     Ring    Pending    Node
    -------------------------------------------------------------------------------
    valid     100.0%     25.0%    '[email protected]'
    valid       0.0%     25.0%    '[email protected]'
    valid       0.0%     25.0%    '[email protected]'
    valid       0.0%     25.0%    '[email protected]'
    -------------------------------------------------------------------------------
    Valid:4 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

    Transfers resulting from cluster changes: 48
      16 transfers from '[email protected]' to '[email protected]'
      16 transfers from '[email protected]' to '[email protected]'
      16 transfers from '[email protected]' to '[email protected]'

    $ riak-admin cluster commit
    Cluster changes committed
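The numbers in the plan output on slide 30 follow from simple arithmetic. The sketch assumes one existing node owns the whole ring and that partitions end up evenly split; the ring size of 64 is not stated on the slide but inferred from 16 transfers corresponding to a 25% share. `join_transfers` is a hypothetical helper.

```python
def join_transfers(ring_size: int, existing_nodes: int, joining_nodes: int) -> dict:
    """Estimate partition transfers when nodes join and ownership is evened out."""
    total_nodes = existing_nodes + joining_nodes
    per_node = ring_size // total_nodes          # each node's even share
    return {
        "per_joining_node": per_node,            # partitions handed to each new node
        "total": per_node * joining_nodes,       # all transfers in the transition
    }


# One node at 100% of the ring, three nodes joining, inferred ring size of 64.
plan = join_transfers(ring_size=64, existing_nodes=1, joining_nodes=3)
```

That gives 16 partitions per joining node and 48 transfers overall, matching the "Pending 25.0%" column and the transfer summary.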
  31. Failures
    • node fails
    [Diagram: ring with the failed node's partitions marked X]
  32. Failures
    • node fails
    • requests go to fallback
    [Diagram: ring with the failed node's partitions marked X; hash("meetups/CloudDC")]
  33. Failures
    • node fails
    • requests go to fallback
    • node comes back
    [Diagram: ring; hash("meetups/CloudDC")]
  34. Failures
    • node fails
    • requests go to fallback
    • node comes back
    • "Handoff" - data returns to recovered node
    [Diagram: ring; hash("meetups/CloudDC")]
  35. Failures
    • node fails
    • requests go to fallback
    • node comes back
    • "Handoff" - data returns to recovered node
    • normal operations resume
    [Diagram: ring; hash("meetups/CloudDC")]
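The failure sequence on slides 31 to 35 can be mimicked with a toy model: a write meant for a down node is parked on a fallback together with a note of who it was meant for, and is handed back once that node returns. This is a deliberately simplified sketch, not Riak's implementation; the `Node` class and the `put` and `handoff` functions are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)    # replicas this node owns
    hinted: dict = field(default_factory=dict)  # {intended owner: {key: value}}


def put(ring, primaries, key, value):
    """Write to each primary; park the replica on a fallback if one is down."""
    fallbacks = [n for n in ring if n.up and n not in primaries]
    for primary in primaries:
        if primary.up:
            primary.data[key] = value
        else:
            # Raises IndexError if no fallback is left; fine for a sketch.
            fallback = fallbacks.pop(0)
            fallback.hinted.setdefault(primary.name, {})[key] = value


def handoff(ring, recovered):
    """Return parked replicas to a node that has come back."""
    for node in ring:
        recovered.data.update(node.hinted.pop(recovered.name, {}))


ring = [Node(f"node {i}") for i in range(4)]
ring[1].up = False                                   # node fails
put(ring, ring[:3], "meetups/CloudDC", b"slides")    # requests go to fallback
ring[1].up = True                                    # node comes back
handoff(ring, ring[1])                               # data returns to recovered node
```

After the handoff the fallback holds nothing for the recovered node, which corresponds to "normal operations resume".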
  36. Built-in stats & DTrace support
    • track access & storage per user
    • monitor total cluster ops
    • inspect ops with DTrace probes
    • create custom billing policies
  37. OPERATIONAL STATS
    exposed via HTTP resource: /riak-cs/stats
    • block: GET, PUT, DELETE
    • bucket: LIST KEYS, CREATE, DELETE, GET/PUT ACL
    • object: GET, PUT, DELETE, HEAD, GET/PUT ACL
    HISTOGRAMS & COUNTERS
  38. QUERY USAGE STATS
    GET s3://usage/access_key/options/start_time/end_time
    • access_key: MT_WDIUW64WFQLUP6IOO
    • options: (a = access, b = storage ; j = JSON, x = XML)
    • start_time: 20121017T140000Z (ISO8601)
    • end_time: 20121018T140000Z (ISO8601)
    via HTTP or S3
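The usage query on slide 38 is essentially a path built from four parts, so a small helper can assemble it. The sketch assumes the option letters are simply concatenated (for example "abj" for access plus storage as JSON), which the slide implies but does not spell out; `usage_path` is a hypothetical name, and the host and any URL prefix are left out on purpose.

```python
from datetime import datetime, timezone

ISO8601_BASIC = "%Y%m%dT%H%M%SZ"  # e.g. 20121017T140000Z


def usage_path(access_key, start, end, access=True, storage=True, fmt="json"):
    """Build usage/access_key/options/start_time/end_time from its parts."""
    options = ""
    if access:
        options += "a"                               # a = access
    if storage:
        options += "b"                               # b = storage
    options += {"json": "j", "xml": "x"}[fmt]        # j = JSON, x = XML
    # Timezone-aware datetimes are expected; they are rendered in UTC.
    stamps = [t.astimezone(timezone.utc).strftime(ISO8601_BASIC) for t in (start, end)]
    return "/".join(["usage", access_key, options] + stamps)


path = usage_path(
    "MT_WDIUW64WFQLUP6IOO",
    datetime(2012, 10, 17, 14, 0, tzinfo=timezone.utc),
    datetime(2012, 10, 18, 14, 0, tzinfo=timezone.utc),
)
```

With the slide's example values this produces `usage/MT_WDIUW64WFQLUP6IOO/abj/20121017T140000Z/20121018T140000Z`.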
  39. Access usage response (JSON):
    {
      "Access": {
        "Errors": [],
        "Nodes": [
          {
            "Node": "[email protected]",
            "Samples": [
              {
                "EndTime": "20121018T130000Z",
                "StartTime": "20121018T120000Z",
                "UsageRead": { "BytesOut": 10509, "Count": 1 }
              },
              {
                "EndTime": "20121018T130000Z",
  40. Access usage response (continued):
                "UsageRead": { "BytesOut": 10509, "Count": 1 }
              },
              {
                "EndTime": "20121018T000000Z",
                "KeyWrite": { "BytesIn": 104857600, "Count": 1 },
                "ListBuckets": { "BytesOut": 406, "Count": 1 },
                "StartTime": "20121017T230000Z"
              },
              {
                "EndTime": "20121018T000000Z",
  41. USER ERRORS: requests that result in 400-499 response codes
              {
                "EndTime": "20121020T020000Z",
                "KeyWrite": {
                  "UserErrorBytesIn": 10240000000,
                  "UserErrorBytesOut": 224,
                  "UserErrorCount": 1
                },
    you can’t store objects > 5GB
  42. Storage usage response (JSON):
    {
      "Access": "not_requested",
      "Storage": {
        "Errors": [],
        "Samples": [
          {
            "EndTime": "20121018T060204Z",
            "StartTime": "20121018T060203Z",
            "some_bucket": { "Bytes": 107167497049, "Objects": 106 },
            "empty_bucket": { "Bytes": 0, "Objects": 0
  43. Storage usage response (continued):
          {
            "EndTime": "20121017T060202Z",
            "StartTime": "20121017T060201Z",
            "some_bucket": { "Bytes": 106014063449, "Objects": 104 },
            "empty_bucket": { "Bytes": 0, "Objects": 0 }
          }
        ]
      }
    }
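One thing the storage samples on slides 42 and 43 make possible is measuring how much a bucket grew between two snapshots, which is the kind of figure a billing policy would use. The sketch copies the two samples from the slides (closing the object that slide 42 cuts off) and subtracts them; `bucket_growth` is a hypothetical helper.

```python
samples = [
    {"EndTime": "20121018T060204Z", "StartTime": "20121018T060203Z",
     "some_bucket": {"Bytes": 107167497049, "Objects": 106},
     "empty_bucket": {"Bytes": 0, "Objects": 0}},
    {"EndTime": "20121017T060202Z", "StartTime": "20121017T060201Z",
     "some_bucket": {"Bytes": 106014063449, "Objects": 104},
     "empty_bucket": {"Bytes": 0, "Objects": 0}},
]


def bucket_growth(samples, bucket):
    """Return (bytes added, objects added) between the oldest and newest sample."""
    # Basic-format ISO8601 timestamps sort correctly as plain strings.
    ordered = sorted(samples, key=lambda s: s["StartTime"])
    first, last = ordered[0][bucket], ordered[-1][bucket]
    return last["Bytes"] - first["Bytes"], last["Objects"] - first["Objects"]


grown_bytes, grown_objects = bucket_growth(samples, "some_bucket")
```

For `some_bucket` this comes to 1153433600 bytes (1100 MB at 1048576 bytes per MB) across 2 new objects in roughly a day.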
  44. Provisioning API
    require 'fog'  # the fog gem provides Fog::RiakCS

    # RIAK_CS_* are placeholders for your admin credentials and endpoint.
    @client = Fog::RiakCS::Provisioning.new(
      :riakcs_access_key_id     => RIAK_CS_ADMIN_KEY,
      :riakcs_secret_access_key => RIAK_CS_ADMIN_SECRET,
      :host                     => RIAK_CS_HOST,
      :port                     => RIAK_CS_PORT,
      :scheme                   => RIAK_CS_SCHEME
    )

    @client.create_user(email, name)
    @client.list_users
    @client.get_user(key_id)
    @client.enable_user(key_id)
    @client.disable_user(key_id)
    @client.regrant_secret(key_id)
  45. Usage API
    require 'fog'  # the fog gem provides Fog::RiakCS

    # RIAK_CS_* are placeholders for your admin credentials and endpoint.
    @client = Fog::RiakCS::Usage.new(
      :riakcs_access_key_id     => RIAK_CS_ADMIN_KEY,
      :riakcs_secret_access_key => RIAK_CS_ADMIN_SECRET,
      :host                     => RIAK_CS_HOST,
      :port                     => RIAK_CS_PORT,
      :scheme                   => RIAK_CS_SCHEME
    )

    # Usage types are given as symbols.
    @client.get_usage(
      key_id,
      :format     => :json,
      :types      => [:access, :storage],
      :start_time => start_time,
      :end_time   => end_time
    )