
Riak Cloud Storage

Andy Gross
November 30, 2012

Overview of Riak and Riak CS from the 2012 CloudStack Collaboration Conference.

Transcript

  1. Simple Available Cloud Storage
    CloudStack Collaboration Conference
    Friday, November 30, 12

  2. whoami
    andy gross
    @argv0
    chief architect
    basho technologies

  3. basho
    Founded 2008 by ex-Akamai execs, engineers
    Sponsors of the Riak open source (Apache2) distributed database
    100+ employees, 75% engineers, in SF, MA, DC, London, Tokyo
    We sell Riak EDS (Open Source + Multi-Datacenter replication), plus support, training, services.
    ... and Riak CS!

  4. (no text transcribed for this slide)

  5. Design Goals
    high-availability
    low-latency
    horizontal scalability
    fault tolerance
    ops friendliness
    predictability

  6. Overview

  7. On March 27, 2012, Basho announced a new product called Riak CS

  8. Riak CS is...
    enterprise cloud storage offering built on top of Riak
    S3-compatibility
    multi-tenancy
    per user billing
    large object storage

  9. Riak: distributed, masterless, highly available key value store
    PROS:
    high read/write availability
    predictable latency
    minimal maintenance required
    CONS:
    I/O bound
    network is very chatty
    permissive API

  10. Enabling you to host your own PUBLIC & PRIVATE CLOUDS

  11. STORE MEDIA
    STORE BACKUPS

  12. STORE ANY TYPE OF DATA YOU LIKE
    01101011 01101001 01110100 01110100
    01100101 01101110 01110011 00100000
    01110000 01110101 01110000 01110000
    01101001 01100101 01110011 00100000
    01101000 01100001 01101101 01110011
    01110100 01100101 01110010 01110011
    00100000 01110011 01110101 01100111
    01100001 01110010 00100000 01100111
    01101100 01101001 01100100 01100101
    01110010 01110011 00100000 00100001

  13. STORE ANY TYPE OF DATA YOU LIKE
    and read it back the same way.
    01101011 01101001 01110100 01110100
    01100101 01101110 01110011 00100000
    01110000 01110101 01110000 01110000
    01101001 01100101 01110011 00100000
    01101000 01100001 01101101 01110011
    01110100 01100101 01110010 01110011
    00100000 01110011 01110101 01100111
    01100001 01110010 00100000 01100111
    01101100 01101001 01100100 01100101
    01110010 01110011 00100000 00100001
    01101011 01101001 01110100 01110100
    01100101 01101110 01110011 00100000
    01110000 01110101 01110000 01110000
    01101001 01100101 01110011 00100000
    01101000 01100001 01101101 01110011
    01110100 01100101 01110010 01110011
    00100000 01110011 01110101 01100111
    01100001 01110010 00100000 01100111
    01101100 01101001 01100100 01100101
    01110010 01110011 00100000 00100001
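
The rows of bits on slides 12 and 13 are plain 8-bit ASCII. A minimal Python sketch that decodes them (the function names are illustrative, not part of Riak CS):

```python
# The ten rows of bits from the slide, copied verbatim.
BITS = """
01101011 01101001 01110100 01110100
01100101 01101110 01110011 00100000
01110000 01110101 01110000 01110000
01101001 01100101 01110011 00100000
01101000 01100001 01101101 01110011
01110100 01100101 01110010 01110011
00100000 01110011 01110101 01100111
01100001 01110010 00100000 01100111
01101100 01101001 01100100 01100101
01110010 01110011 00100000 00100001
"""


def decode_bits(text: str) -> str:
    """Interpret whitespace-separated 8-bit groups as ASCII characters."""
    return bytes(int(group, 2) for group in text.split()).decode("ascii")


def encode_bits(message: str) -> str:
    """Inverse of decode_bits: one 8-bit group per ASCII character."""
    return " ".join(format(byte, "08b") for byte in message.encode("ascii"))


if __name__ == "__main__":
    print(decode_bits(BITS))
```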

  14. BASIC CONCEPTS: USERS
    multi-tenancy: Riak CS will track individual usage/stats
    users identified by access_key
    users authenticated by secret_key
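
A sketch of how an access_key/secret_key pair is typically used by an S3-style client, assuming Riak CS accepts the classic AWS Signature Version 2 `Authorization` header (the slides only claim S3 compatibility). The keys, resource and date below are placeholders, and `x-amz-*` headers are left out for brevity:

```python
import base64
import hashlib
import hmac


def sign_request(access_key, secret_key, verb, resource, date,
                 content_md5="", content_type=""):
    """Build an 'AWS <access_key>:<signature>' header value (Signature V2 style).

    The access_key identifies the user; the secret_key never travels over
    the wire and only feeds the HMAC, so the server can authenticate the
    request by recomputing the signature.
    """
    # Simplified StringToSign: no x-amz-* headers are included here.
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"AWS {access_key}:{signature}"


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    print(sign_request("EXAMPLE_ACCESS_KEY", "example-secret-key",
                       "GET", "/some_bucket/some_object",
                       "Fri, 30 Nov 2012 00:00:00 GMT"))
```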

  15. BASIC CONCEPTS: BUCKETS
    users create buckets
    buckets are like folders
    store objects in buckets
    names are globally unique

  16. BASIC CONCEPTS: OBJECTS
    stored in buckets
    objects are opaque
    store any file type
    file sizes up to 5GB*

  17. BASIC CONCEPTS: ACLs
    Access Control Lists
    permissions on buckets
    permissions on objects
    permissions on permissions

  18. on cloud services...

  19. “cloud” = handwaving over complexity

  20. cloud = compute + storage

  21. cloud = global scale

  22. cloud = distributed systems

  23. Desirable Properties
    high-availability
    low-latency
    horizontal scalability
    fault tolerance
    ops friendliness
    predictability

  24. Announcing Today: Multi-DC Replication
    • Replicates objects globally to many datacenters
    • Allows public cloud providers to define global regions
    • Allows enterprises global, low latency access to storage
    “[deploying Riak CS] reduces the risk of using AWS and allows customers to store their data in their own data centers, on their own terms.” - Alex Williams, TechCrunch

  25. Coming Soon: Integration with CloudStack
    • Patch to enable Riak CS as secondary storage
    • https://reviews.apache.org/r/8123/
    • Targeting 4.0.1 release
    • Deeper integration also in the works

  26. Architecture

  27. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API), with a large object]

  28. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API), with a large object]
    1. user uploads an object

  29. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API), with a large object]

  30. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API); the large object is shown as a row of 1 MB chunks]
    2. Riak CS breaks object into 1 MB chunks

  31. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API); the large object is shown as a row of 1 MB chunks]

  32. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API); the large object is shown as a row of 1 MB chunks]
    3. Riak CS streams chunks to Riak nodes

  33. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API), with a large object]

  34. [Diagram: five Riak nodes and five Riak CS instances (each with an S3 API and a Reporting API), with a large object]
    4. Riak replicates and stores chunks
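
The upload flow on slides 28 to 34 (split a large object into 1 MB chunks, then store each chunk) can be sketched as follows. This is an illustration only: the in-memory `dict` stands in for the Riak cluster, the `(bucket, key, seq)` chunk-key layout is hypothetical, and "1 MB" is assumed to mean 2**20 bytes.

```python
CHUNK_SIZE = 1024 * 1024  # "1 MB" on the slides; assumed to be 2**20 bytes


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (sequence_number, chunk) pairs that cover data in order."""
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        yield seq, data[offset:offset + chunk_size]


def store_object(store: dict, bucket: str, key: str, data: bytes) -> int:
    """Write every chunk under a hypothetical (bucket, key, seq) key.

    Returns the number of chunks written, which a real system would keep
    in some form of manifest so the object can be reassembled.
    """
    count = 0
    for seq, chunk in split_into_chunks(data):
        store[(bucket, key, seq)] = chunk
        count += 1
    return count


def read_object(store: dict, bucket: str, key: str, count: int) -> bytes:
    """Reassemble the object by concatenating its chunks in sequence order."""
    return b"".join(store[(bucket, key, seq)] for seq in range(count))


if __name__ == "__main__":
    blob = bytes(2 * CHUNK_SIZE + 512 * 1024)  # 2.5 MB of zero bytes
    cluster = {}
    chunks = store_object(cluster, "media", "video.bin", blob)
    print(chunks, read_object(cluster, "media", "video.bin", chunks) == blob)
```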

  35. UNDER THE HOOD

  36. REPLICATION and ADMINISTRATION
    consistent hashing
    replicas
    virtual nodes (vnodes)
    handoff
    gossip protocols
    anti-entropy

  37. THE RING
    160-BIT INT KEYSPACE DIVIDED EVENLY INTO PARTITIONS

  38. VNODES CLAIM PARTITIONS
    node 0
    node 1
    node 2

  39. VNODES
    PARTITIONS
    node 0
    node 1
    node 2

  40. REBALANCE VNODES, PARTITIONS
    node 0
    node 1
    node 2
    + node 3
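
Slides 37 to 40 can be read as: divide the 160-bit keyspace into equal partitions, let nodes claim them, and hand a share to a node that joins. The sketch below is a deliberately simple stand-in and not Riak's actual claim algorithm. The ring size of 64 partitions and the "every k-th partition" hand-over rule are assumptions made for illustration.

```python
from collections import Counter

RING_SIZE = 2 ** 160      # the 160-bit integer keyspace from slide 37
NUM_PARTITIONS = 64       # assumed partition count, chosen for illustration


def partition_starts(num_partitions: int = NUM_PARTITIONS):
    """Return the first key of each equally sized partition."""
    width = RING_SIZE // num_partitions
    return [i * width for i in range(num_partitions)]


def initial_claim(nodes, num_partitions: int = NUM_PARTITIONS):
    """Round-robin claim: partition i is owned by nodes[i % len(nodes)]."""
    return [nodes[i % len(nodes)] for i in range(num_partitions)]


def add_node(owners, new_node):
    """Give the new node every k-th partition, where k is the new node count.

    Only the partitions handed to the new node change owner; all others
    stay where they were.
    """
    stride = len(set(owners)) + 1
    return [new_node if i % stride == stride - 1 else owner
            for i, owner in enumerate(owners)]


if __name__ == "__main__":
    before = initial_claim(["node 0", "node 1", "node 2"])
    after = add_node(before, "node 3")
    print(Counter(before))
    print(Counter(after))
    print("partitions moved:", sum(a != b for a, b in zip(before, after)))
```

With these assumptions, adding a fourth node moves a quarter of the partitions and leaves the rest untouched, which is the property the rebalance slide illustrates.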

  41. consistent hashing
    node 0
    node 1
    node 2
    node 3
    hash(“bucket/key”)
    N = 3

  42. consistent hashing
    node 0
    node 1
    node 2
    node 3
    hash(“bucket/key”)
    N = 3
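
A sketch of the lookup on slides 41 and 42: hash "bucket/key" onto the ring and take N = 3 consecutive partitions. It assumes SHA-1 over the literal string "bucket/key" (Riak's real key encoding may differ), a 64-partition ring, and a simple round-robin ownership table; all of these are illustrative choices.

```python
import hashlib

RING_SIZE = 2 ** 160   # SHA-1 produces 160-bit values
N_VAL = 3              # N = 3, as on the slides


def ring_position(bucket: str, key: str) -> int:
    """Map "bucket/key" to a 160-bit integer position on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def preference_list(bucket: str, key: str, owners, n: int = N_VAL):
    """Return n (partition_index, owner) pairs, walking clockwise.

    owners[i] is the node that owns partition i. The walk starts at the
    partition containing the hash, a simplification of the real lookup.
    """
    width = RING_SIZE // len(owners)
    first = (ring_position(bucket, key) // width) % len(owners)
    indexes = [(first + i) % len(owners) for i in range(n)]
    return [(index, owners[index]) for index in indexes]


if __name__ == "__main__":
    owners = [f"node {i % 4}" for i in range(64)]  # assumed ownership table
    for partition, node in preference_list("bucket", "key", owners):
        print(partition, node)
```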

  43. Quorum requests
    N R W
    PR/PW DW
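
In Riak, N is the number of replicas, R and W are the number of replica responses required for a read or a write, PR/PW count only primary replicas, and DW counts durable writes. A small sketch of the usual overlap rule follows; the helper name is made up for illustration.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True when every read quorum intersects every write quorum.

    With n replicas, a read that waits for r responses and a write that
    waits for w responses must share at least one replica when r + w > n.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


if __name__ == "__main__":
    print(quorums_overlap(3, 2, 2))  # majority reads and writes
    print(quorums_overlap(3, 1, 1))  # fastest, but reads may miss a write
```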

  44. disaster scenario
    node 0
    node 1
    node 3
    node 2

  45. disaster scenario
    node 0
    node 1
    node 3
    node 2

  46. disaster scenario
    node 0
    node 1
    node 3
    node 2
    requests go to fallback

  47. disaster scenario
    node 0
    node 1
    node 3
    node 2

  48. disaster scenario
    node 0
    node 1
    node 3
    node 2
    node comes back online

  49. disaster scenario
    node 0
    node 1
    node 3
    node 2

  50. disaster scenario
    node 0
    node 1
    node 3
    node 2
    normal operations resume
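
The disaster scenario on slides 44 to 50 (a node fails, requests go to a fallback, the node returns, data is handed back) can be modelled with a toy in-memory cluster. Everything here, including the `Cluster` class and the way hints are stored, is a hypothetical illustration of the handoff idea and not Riak code.

```python
class Cluster:
    """Toy model of fallback writes and handoff on recovery."""

    def __init__(self, nodes):
        self.nodes = list(nodes)                    # ring order
        self.up = set(nodes)                        # nodes currently reachable
        self.data = {node: {} for node in nodes}    # data owned by each node
        # Data a fallback holds on behalf of a failed node:
        # fallback -> {(intended_owner, key): value}
        self.hints = {node: {} for node in nodes}

    def fail(self, node):
        self.up.discard(node)

    def put(self, primaries, key, value):
        """Write to each primary, or to a fallback when a primary is down."""
        written = []
        for primary in primaries:
            if primary in self.up:
                self.data[primary][key] = value
                written.append(primary)
            else:
                fallback = self._fallback_for(primary, primaries)
                self.hints[fallback][(primary, key)] = value
                written.append(fallback)
        return written

    def _fallback_for(self, primary, primaries):
        """Pick the next live node after the failed primary that is not a primary."""
        start = self.nodes.index(primary)
        for step in range(1, len(self.nodes)):
            candidate = self.nodes[(start + step) % len(self.nodes)]
            if candidate in self.up and candidate not in primaries:
                return candidate
        raise RuntimeError("no live fallback node available")

    def recover(self, node):
        """Bring a node back and hand off everything held on its behalf."""
        self.up.add(node)
        for hinted in self.hints.values():
            for owner, key in [k for k in hinted if k[0] == node]:
                self.data[node][key] = hinted.pop((owner, key))


if __name__ == "__main__":
    cluster = Cluster(["node 0", "node 1", "node 2", "node 3"])
    primaries = ["node 0", "node 1", "node 2"]
    cluster.fail("node 2")
    print(cluster.put(primaries, "bucket/key", b"value"))  # node 3 stands in
    cluster.recover("node 2")
    print(cluster.data["node 2"])
```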

  51. Operations

  52. built-in stats & DTrace support
    track access & storage per user
    monitor total cluster ops
    inspect ops with DTrace probes
    create custom billing policies

  53. OPERATIONAL STATS
    exposed via HTTP resource: /riak-cs/stats
    block: GET, PUT, DELETE
    bucket: LIST KEYS, CREATE, DELETE, GET/PUT ACL
    object: GET, PUT, DELETE, HEAD, GET/PUT ACL
    HISTOGRAMS & COUNTERS

  54. THE “USAGE” BUCKET
    TRACK INDIVIDUAL USER’S ACCESS, STORAGE

  55. QUERY USAGE STATS
    GET s3://usage/access_key/options/start_time/end_time
    access_key: MT_WDIUW64WFQLUP6IOO
    options: (a = access, b = storage ; j = JSON, x = XML)
    start_time: 20121017T140000Z (ISO8601)
    end_time: 20121018T140000Z (ISO8601)
    via HTTP or S3
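
A sketch of building the usage query path from slide 55. The path layout and option letters come from the slide; the helper name, its parameters, and the assumption that option letters are simply concatenated (for example "aj") are illustrative.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y%m%dT%H%M%SZ"  # e.g. 20121017T140000Z


def usage_path(access_key, start, end, access=True, storage=False, fmt="j"):
    """Build "usage/access_key/options/start_time/end_time".

    Options follow the slide: a = access, b = storage, j = JSON, x = XML.
    start and end are expected to be datetimes already expressed in UTC.
    """
    if fmt not in ("j", "x"):
        raise ValueError("fmt must be 'j' (JSON) or 'x' (XML)")
    if not (access or storage):
        raise ValueError("request access stats, storage stats, or both")
    options = ("a" if access else "") + ("b" if storage else "") + fmt
    return "/".join(["usage", access_key, options,
                     start.strftime(TIME_FORMAT), end.strftime(TIME_FORMAT)])


if __name__ == "__main__":
    print(usage_path("MT_WDIUW64WFQLUP6IOO",
                     datetime(2012, 10, 17, 14, tzinfo=timezone.utc),
                     datetime(2012, 10, 18, 14, tzinfo=timezone.utc)))
```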

  56. {
      "Access": {
        "Errors": [],
        "Nodes": [
          {
            "Node": "[email protected]",
            "Samples": [
              {
                "EndTime": "20121018T130000Z",
                "StartTime": "20121018T120000Z",
                "UsageRead": {
                  "BytesOut": 10509,
                  "Count": 1
                }
              },
              {
                "EndTime": "20121018T130000Z",

  57. "UsageRead": {
        "BytesOut": 10509,
        "Count": 1
      }
    },
    {
      "EndTime": "20121018T000000Z",
      "KeyWrite": {
        "BytesIn": 104857600,
        "Count": 1
      },
      "ListBuckets": {
        "BytesOut": 406,
        "Count": 1
      },
      "StartTime": "20121017T230000Z"
    },
    {
      "EndTime": "20121018T000000Z",

  58. {
      "EndTime": "20121020T020000Z",
      "KeyWrite": {
        "UserErrorBytesIn": 10240000000,
        "UserErrorBytesOut": 224,
        "UserErrorCount": 1
      },
    USER ERRORS:
    requests that result in 400-499 response codes
    you can’t store objects > 5GB

  59. {
      "Access": "not_requested",
      "Storage": {
        "Errors": [],
        "Samples": [
          {
            "EndTime": "20121018T060204Z",
            "StartTime": "20121018T060203Z",
            "some_bucket": {
              "Bytes": 107167497049,
              "Objects": 106
            },
            "empty_bucket": {
              "Bytes": 0,
              "Objects": 0

  60. {
            "EndTime": "20121017T060202Z",
            "StartTime": "20121017T060201Z",
            "tsantero": {
              "Bytes": 106014063449,
              "Objects": 104
            },
            "empty_bucket": {
              "Bytes": 0,
              "Objects": 0
            }
          }
        ]
      }
    }
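
A sketch of totalling an access report shaped like the JSON on slides 56 and 57. The sample document below is reassembled from values on those slides rather than captured from a real response, and the node name is a placeholder because the original is not legible in the transcript.

```python
import json
from collections import Counter

# Reassembled sample; "example-node" is a placeholder node name.
SAMPLE = json.loads("""
{
  "Access": {
    "Errors": [],
    "Nodes": [
      {
        "Node": "example-node",
        "Samples": [
          {"StartTime": "20121018T120000Z", "EndTime": "20121018T130000Z",
           "UsageRead": {"BytesOut": 10509, "Count": 1}},
          {"StartTime": "20121017T230000Z", "EndTime": "20121018T000000Z",
           "KeyWrite": {"BytesIn": 104857600, "Count": 1},
           "ListBuckets": {"BytesOut": 406, "Count": 1}}
        ]
      }
    ]
  },
  "Storage": "not_requested"
}
""")


def summarize_access(report: dict) -> dict:
    """Sum every numeric field (BytesIn, BytesOut, Count, ...) over all samples."""
    totals = Counter()
    for node in report["Access"]["Nodes"]:
        for sample in node["Samples"]:
            for stats in sample.values():
                if not isinstance(stats, dict):
                    continue  # skip the StartTime / EndTime strings
                for field, value in stats.items():
                    totals[field] += value
    return dict(totals)


if __name__ == "__main__":
    print(summarize_access(SAMPLE))
```

The same loop would also pick up the `UserError*` fields from slide 58, since it sums whatever numeric fields each operation reports.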

  61. FIND US FOR A DEVELOPER’S TRIAL
    http://docs.basho.com/riakcs/latest/
    Q&A