
Riak - An Intro With Windows Azure

This slide deck covers the part of the talk centered on the Riak architecture and related material. It doesn't currently include the Azure sample commands or related elements, since those belong to the live part of the presentation. I'll likely add them in the future.

Adron Hall

March 08, 2013

Transcript

  1. &
    Thursday, March 7, 13


  2. #WHOIS
    Adron Hall | @adron | Coder, Messenger, Recon


  6. Distributed, masterless, highly-available key/value store

  7. DESIGN GOALS
    Horizontal Scalability
    Fault-Tolerance
    Low-latency
    Ops Friendliness
    Predictability
    High-Availability

  8. When to use Riak...

  9. RIAK USE CASES
    Metadata
    Users/Profiles
    Object Storage
    Session Storage
    Sensor Data
    Logging Systems
    Record Systems
    Notification Systems

  10. IN PRODUCTION AT
    And 1000s more...

  11. DATA MODEL

  12. {“Key”:“Value”}
    • Values are stored against keys
    • Key/Value + Metadata = Object
    • Fundamental Unit of Replication
    • Any Datatype will work
    • Recorded to disk in binary format

  13. <bucket>/<key>
    • Virtual Namespace
    • Bucket + Key = Object Address
    • Buckets have properties
    • Objects in bucket inherit properties
    • No relationships between buckets
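
As a rough illustration of the data model on the last two slides, the object and its address could be modelled like this. The `RiakObject` class and its field names are hypothetical and exist only for this sketch; they are not part of any Riak client library.

```python
from dataclasses import dataclass, field


@dataclass
class RiakObject:
    """Toy model: key/value plus metadata makes an object (hypothetical class)."""
    bucket: str
    key: str
    value: bytes                                   # any datatype, kept as opaque bytes
    metadata: dict = field(default_factory=dict)   # e.g. content type, index tags

    def address(self) -> str:
        # Bucket + key together form the object's address.
        return f"{self.bucket}/{self.key}"


obj = RiakObject("users", "adron", b'{"name": "Adron"}',
                 {"content-type": "application/json"})
print(obj.address())
```

The value is deliberately opaque here: the store does not interpret it, which is why any datatype will work.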

  14. DATA ACCESS

  15. INTERFACES
    HTTP API - Largely-faithful REST implementation, via a little piece of magic called Webmachine
    Protocol Buffers API - Compact, binary protocol. Thanks, Google!

  16. CLIENT LIBS
    Python
    Ruby
    PHP
    OCaml
    Java
    Perl
    Erlang
    Node.js
    C/C++
    Haskell
    Clojure
    Scala
    Go
    Dart
    .NET
    And more. Supported by either Basho or our community.

  17. RIAK GIVES YOU [FOUR] WAYS TO STORE, RETRIEVE, AND QUERY DATA

  18. CRUD
    // PUT
    PUT /buckets/bucket/keys/key       // User-defined key
    POST /buckets/bucket/keys          // Riak-defined key
    // GET
    GET /buckets/bucket/keys/key
    // DELETE
    DELETE /buckets/bucket/keys/key
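
A minimal sketch of how these four requests might be built from Python's standard library. The host, port, bucket, and key are assumptions for illustration, and the helper names `key_url` and `build_request` are hypothetical. The requests are only constructed, not sent. A POST to the keys collection without a key is what lets Riak pick the key.

```python
from urllib.parse import quote
from urllib.request import Request

BASE = "http://localhost:8098"  # assumed host and port for a local node


def key_url(bucket, key=None):
    """Build /buckets/<bucket>/keys[/<key>]; omit the key for Riak-defined keys."""
    url = f"{BASE}/buckets/{quote(bucket, safe='')}/keys"
    return url if key is None else f"{url}/{quote(key, safe='')}"


def build_request(method, bucket, key=None, body=None):
    """Return an unsent urllib Request for the given verb (hypothetical helper)."""
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return Request(key_url(bucket, key), data=body, headers=headers, method=method)


put = build_request("PUT", "users", "adron", b'{"name": "Adron"}')   # user-defined key
post = build_request("POST", "users", body=b'{"name": "anon"}')      # Riak-defined key
get = build_request("GET", "users", "adron")
delete = build_request("DELETE", "users", "adron")

for req in (put, post, get, delete):
    print(req.get_method(), req.full_url)
```

Passing any of these to `urllib.request.urlopen` would send it, assuming a node is actually listening at `BASE`.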

  19. MapReduce
    Distributed processing system using Riak Pipe
    Efficient for targeted queries over a known key range
    Write jobs in Erlang or JS. (Erlang is more performant.)
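
To make the "known key range" point concrete, here is a hedged sketch of a job description of the kind the HTTP MapReduce endpoint accepts as JSON: explicit bucket/key inputs followed by a map and a reduce phase written in JavaScript. The bucket, keys, and the counting logic are made up for illustration, and `count_job` is a hypothetical helper. The sketch only builds the JSON body.

```python
import json


def count_job(bucket, keys):
    """Describe a job that counts the given objects (illustrative only)."""
    return {
        # Targeted inputs: an explicit list of [bucket, key] pairs.
        "inputs": [[bucket, key] for key in keys],
        "query": [
            # Map phase: emit 1 for every object visited.
            {"map": {"language": "javascript",
                     "source": "function(v) { return [1]; }"}},
            # Reduce phase: sum the emitted values.
            {"reduce": {"language": "javascript",
                        "source": "function(values) {"
                                  " return [values.reduce("
                                  "function(a, b) { return a + b; }, 0)]; }"}},
        ],
    }


body = json.dumps(count_job("users", ["adron", "bob"]))
print(body)
```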

  20. Secondary Indexing (2i)
    riak_object: X-Riak-Index-email_bin = [email protected]
    riak_object: X-Riak-Index-value_int = “42”
    Tag objects with custom metadata on PUT...
    Exact match and range queries...
    No multi-index queries yet...
    Pagination is on its way...
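
A small sketch of the two halves of 2i: the index headers attached on PUT, and the URL paths for exact-match and range queries. The helper names are hypothetical, and the email address is a placeholder rather than the one on the slide.

```python
def index_headers(indexes):
    """Turn {"email": "...", "value": 42} into X-Riak-Index-* headers."""
    headers = {}
    for name, value in indexes.items():
        # Integer indexes use the _int suffix, everything else _bin.
        suffix = "_int" if isinstance(value, int) else "_bin"
        headers[f"X-Riak-Index-{name}{suffix}"] = str(value)
    return headers


def index_query_path(bucket, index, start, end=None):
    """Exact match when end is None, otherwise an inclusive range query."""
    path = f"/buckets/{bucket}/index/{index}/{start}"
    return path if end is None else f"{path}/{end}"


print(index_headers({"email": "user@example.com", "value": 42}))
print(index_query_path("users", "email_bin", "user@example.com"))
print(index_query_path("users", "value_int", 10, 50))
```

Each query names exactly one index, which matches the "no multi-index queries yet" caveat above.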

  21. Riak Search
    Store and index documents (JSON, text, XML, etc)
    Current Riak Search supports a subset of the Solr API
    Next iteration (Yokozuna; in beta) will implement distributed Solr on Riak. It will be sexy.
    Looking for beta testers to help harden Yokozuna

  22. ARCHITECTURE
    The scalability and ease-of-operation goals inform architectural decisions. These come with tradeoffs.
    Consistent Hashing
    Virtual Nodes
    Append-only storage
    Handoff/Rebalancing
    Vector Clocks
    Active Anti-Entropy*

  23. Consistent Hashing
    Location of data in the Riak ring is determined by a hash of bucket + key.
    Provides even distribution of storage and query load
    Trades off advantages gained from locality
    - e.g. Range queries and aggregates
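
The idea can be sketched in a few lines: hash bucket + key onto a 160-bit ring with SHA-1 and divide the ring into equal partitions. The ring size of 64 and the way bucket and key are joined before hashing are assumptions of this sketch, not a description of Riak's exact encoding or ownership rule.

```python
import hashlib
from collections import Counter

RING_SIZE = 64            # number of partitions (assumed for the sketch)
KEYSPACE = 2 ** 160       # size of the SHA-1 output space


def ring_position(bucket, key):
    """Hash bucket + key to an integer position on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode()).digest()
    return int.from_bytes(digest, "big")


def partition_for(bucket, key):
    """Map a ring position to one of RING_SIZE equal-width partitions."""
    return ring_position(bucket, key) // (KEYSPACE // RING_SIZE)


# Adjacent keys land in unrelated partitions: load spreads evenly,
# but locality (and with it cheap range scans) is lost.
counts = Counter(partition_for("users", f"user{i}") for i in range(6400))
print(min(counts.values()), max(counts.values()))
```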

  24. Consistent Hashing

  25. Virtual Nodes
    Unit of addressing and concurrency in Riak
    Each physical host manages many vnodes
    Partition count / physical machines = vnodes/machine*
    Decouples physical assets from data distribution. This provides:
    - simplicity in cluster sizing
    - failure isolation
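
A worked example of the vnodes-per-machine arithmetic, with a naive round-robin assignment of partitions to hosts. The partition count of 64, the five node names, and the round-robin policy are all assumptions for illustration; Riak's real claim algorithm is more involved.

```python
def vnodes_per_machine(partition_count, machines):
    """Average number of vnodes each physical machine manages."""
    return partition_count / machines


def assign_round_robin(partition_count, nodes):
    """Hand out partitions to nodes in turn (simplified stand-in for ring claim)."""
    owners = {node: [] for node in nodes}
    for partition in range(partition_count):
        owners[nodes[partition % len(nodes)]].append(partition)
    return owners


nodes = ["node1", "node2", "node3", "node4", "node5"]
print(vnodes_per_machine(64, len(nodes)))
owners = assign_round_robin(64, nodes)
print({node: len(parts) for node, parts in owners.items()})
```

Because the partition count stays fixed while hosts come and go, adding a machine only changes which host owns which vnodes, not how keys map to partitions.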

  26. Handoff/Rebalancing
    Mechanisms for data rebalancing
    When nodes join/leave the cluster, handoff and rebalancing manage the data shuffling dynamically
    Trades off speed of convergence vs. effects on cluster performance
    - causes disk & network load

  27. Vector Clocks
    VCs used to rectify object consistency at READ time.
    Lots of knobs to turn; well-documented
    Trades off space, speed, and complexity for safety
    - will store all sibling objects until resolved
    - can lead to object size issues
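
A minimal vector clock sketch showing how two versions are compared at read time and when siblings arise. Clocks are plain dicts of actor to counter; the function names and actor names are hypothetical, and this is a textbook version rather than Riak's implementation.

```python
def increment(clock, actor):
    """Return a copy of the clock with the actor's counter bumped by one."""
    new = dict(clock)
    new[actor] = new.get(actor, 0) + 1
    return new


def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify two versions of an object by their clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "siblings"  # concurrent writes: both versions are kept until resolved


base = increment({}, "client_x")
a = increment(base, "client_x")   # client_x updates again
b = increment(base, "client_y")   # client_y updates the same base concurrently
print(compare(a, base))
print(compare(a, b))
```

The sibling case is where the space cost comes from: neither version can be discarded safely, so both are stored.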

  28. Append-Only Storage
    Riak provides a pluggable backend interface. (Write your own; we’ll probably hire you...)
    Bitcask and LevelDB are the most heavily used. Both are append-only.
    Provides crash safety and speed.
    Trade off: periodic compaction/merge ops
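
A toy sketch of the append-only idea and why it needs periodic merging: writes only ever append to a log, an in-memory index points at the latest record per key, and a merge rewrites the log with only the live records. The class is hypothetical and uses an in-memory list in place of files; it is not the on-disk format of any real backend.

```python
class AppendOnlyStore:
    """Toy append-only log with an in-memory index of latest records."""

    def __init__(self):
        self.log = []      # stands in for the data file: (key, value) records
        self.keydir = {}   # key -> position of the newest record for that key

    def put(self, key, value):
        self.log.append((key, value))          # never overwrite in place
        self.keydir[key] = len(self.log) - 1

    def get(self, key):
        return self.log[self.keydir[key]][1]   # raises KeyError if absent

    def delete(self, key):
        self.log.append((key, None))           # tombstone record
        self.keydir.pop(key, None)

    def merge(self):
        """Compaction: keep only the newest live record for each key."""
        live = sorted(self.keydir.items(), key=lambda item: item[1])
        self.log = [(key, self.log[pos][1]) for key, pos in live]
        self.keydir = {key: i for i, (key, _) in enumerate(self.log)}


store = AppendOnlyStore()
store.put("a", 1)
store.put("a", 2)
store.put("b", 3)
store.delete("b")
before = len(store.log)
store.merge()
print(before, len(store.log), store.get("a"))
```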

  29. RIAK 1.3
    (AKA “new hotness”)
    Active Anti Entropy
    MapReduce Improvements
    IPv6 Support
    Riaknostic included by default
    Riak Control improvements
    Much more
    Full release notes: https://github.com/basho/riak/blob/1.3/RELEASE-NOTES.md

  30. FUTURE WORK*
    (1.4 and beyond)
    (* all code subject to ship early, late, or not at all)
    Dynamic Ring Size
    Yokozuna
    CRDTs/Data Types
    Riak Object
    Consistency
    2i Improvements
    Riak Pipe work
    Much more

  31. Multi-tenant cloud storage software for public and private clouds.
    Designed to provide simple, available, distributed cloud storage at any scale.
    S3-API compatible and supports per-tenant reporting for billing and metering use cases. Additional APIs on the way.
    Stores files of arbitrary size. Under the hood, stores 1MB chunks alongside a manifest.
    Stateless proxy (CS) does chunking. Riak does distribution, storage, etc.
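
The chunk-plus-manifest idea can be sketched as follows: a file of arbitrary size is split into 1MB blocks, and a manifest records the order of the blocks so the file can be reassembled. The chunk key scheme and manifest fields are assumptions of this sketch, not the actual layout used by the proxy.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1MB chunks, as on the slide


def chunk_with_manifest(name, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and describe them in a manifest."""
    chunks = {}
    chunk_keys = []
    for offset in range(0, len(data), chunk_size):
        chunk_key = f"{name}:{offset // chunk_size}"   # hypothetical key scheme
        chunks[chunk_key] = data[offset:offset + chunk_size]
        chunk_keys.append(chunk_key)
    manifest = {
        "name": name,
        "size": len(data),
        "chunk_keys": chunk_keys,
        "md5": hashlib.md5(data).hexdigest(),
    }
    return manifest, chunks


def reassemble(manifest, chunks):
    """Rebuild the original bytes by reading chunks in manifest order."""
    return b"".join(chunks[key] for key in manifest["chunk_keys"])


data = bytes(range(256)) * 10240          # 2.5MB of sample data
manifest, chunks = chunk_with_manifest("video.bin", data)
print(len(manifest["chunk_keys"]), reassemble(manifest, chunks) == data)
```

In this picture each chunk and the manifest would be stored as ordinary key/value objects, which is why the proxy itself can stay stateless.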

  32. Extends Riak's capabilities with:
    - multi-datacenter replication
    - SNMP configuration
    - JMX monitoring
    - 24x7 support from Basho engineers
    One cluster acts as a "source cluster". The source cluster replicates its data to one or more "sink clusters" using either real-time or full sync.
    Data transfer is unidirectional (source -> sink). Bidirectional synchronization can be achieved by configuring a pair of connections between clusters.

  33. RIAK COMMUNITY
    Mailing List - 1300 developers
    IRC - 200+ people every day yelling about software
    GitHub - 1000s of watchers; 200+ contributors to all projects
    Meetups - 10 Countries, 23 Cities, 3700+ Members & growing fast!
    Deployments - 1000s in production.

  34. May 13-14th in New York City
    ricon.io/east.html
    Talks, hacking, parties
    Dedicated to the future of Riak and distributed systems in production
    REGISTER NOW! https://ricon-east-2013.eventbrite.com/?discount=lovevnodes

  35. GETTING STARTED
    Downloads - http://docs.basho.com/riak/latest/downloads/
    Docs - http://docs.basho.com
    Riak Source Code - github.com/basho/riak
    All Basho source Code - github.com/basho/
    Riak Mailing List - http://bit.ly/riak-list
    Email or Tweet me @adron or [email protected]

  36. Let’s Talk UI & CLI - Demo Things

  37. #WHOIS
    Adron Hall | @adron | Coder, Messenger, Recon