
Intro to Riak with...

A slide deck I've consolidated and streamlined specifically for describing Riak: it gives a quick high-level overview, then jumps into building a cluster and using the database.

Adron Hall

March 13, 2013

Transcript

  1. Wednesday, March 13, 13

  2. #WHOIS
    Adron Hall | @adron | Coder, Messenger, Recon

  4. Distributed, masterless, highly-available key/value store

  5. DESIGN GOALS
    Horizontal Scalability
    Fault-Tolerance
    Low-latency
    Ops Friendliness
    Predictability
    High-Availability

  6. When and what to use Riak for...
    Metadata
    Users/Profiles
    Object Storage
    Session Storage
    Sensor Data
    Logging Systems
    Record Systems
    Notification Systems

  7. IN PRODUCTION AT
    And 1000s more...

  8. DATA MODEL

  9. {“Key”:“Value”}
    • Values are stored against keys
    • Key/Value + Metadata = Object
    • Fundamental Unit of Replication
    • Any Datatype will work
    • Recorded to disk in binary format

  10. <bucket>/<key>
    • Virtual Namespace
    • Bucket + Keys = Object Address
    • Buckets have properties
    • Objects in bucket inherit properties
    • No relationships between buckets

  11. DATA ACCESS

  12. INTERFACES
    HTTP API - Via a little piece of magic called Webmachine
    - Largely-faithful REST implementation
    Protocol Buffers API - Thanks, Google!
    - Compact, binary protocol

  13. CLIENT LIBS
    Python
    Ruby
    PHP
    OCaml
    Java
    Perl
    Erlang
    Node.js
    C/C++
    Haskell
    Clojure
    Scala
    Go
    Dart
    .NET
    ...and more via Basho
    or our community.

  14. RIAK GIVES YOU
    [FOUR] WAYS TO STORE,
    RETRIEVE, AND QUERY
    DATA

  15. CRUD
    // PUT
    PUT /buckets/bucket/keys/key       // User-defined key
    POST /buckets/bucket/keys          // Riak-defined key
    // GET
    GET /buckets/bucket/keys/key
    // DELETE
    DELETE /buckets/bucket/keys/key
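
The CRUD routes on this slide can be sketched as a small Python helper that only builds the method and path for each operation. It does not talk to a server. The function names `object_path` and `crud_requests` are hypothetical, and the sketch assumes that a POST to the bucket's keys path without a key asks Riak to generate one.

```python
from urllib.parse import quote


def object_path(bucket, key=None):
    """Build the HTTP path for an object; leave key out for a Riak-defined key."""
    path = "/buckets/" + quote(bucket, safe="") + "/keys"
    if key is not None:
        path += "/" + quote(key, safe="")
    return path


def crud_requests(bucket, key):
    """Return the (method, path) pair for each CRUD operation on the slide."""
    return {
        "store_user_key": ("PUT", object_path(bucket, key)),
        "store_riak_key": ("POST", object_path(bucket)),
        "fetch": ("GET", object_path(bucket, key)),
        "delete": ("DELETE", object_path(bucket, key)),
    }
```

Bucket and key are percent-encoded so that a key containing a slash does not change the shape of the path.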

  16. MapReduce
    Distributed processing system using Riak Pipe
    Efficient for targeted queries over known key range
    Write jobs in Erlang or JS. (Erlang more performant)
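
A MapReduce job over a known set of keys might be described as a JSON document like the one this Python sketch builds. The helper name `sum_job` is hypothetical, and the built-in JavaScript function names `Riak.mapValuesJson` and `Riak.reduceSum` are an assumption from memory of Riak's bundled built-ins, so check them against the docs for your release before relying on them.

```python
import json


def sum_job(bucket, keys):
    """Describe a job that sums the JSON values stored under the given keys."""
    return {
        # Explicit bucket/key pairs: the "targeted query over known keys" case.
        "inputs": [[bucket, key] for key in keys],
        "query": [
            # Assumed built-in names; verify against your Riak version.
            {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}},
            {"reduce": {"language": "javascript", "name": "Riak.reduceSum"}},
        ],
    }


def job_body(bucket, keys):
    """Serialize the job description as the JSON request body."""
    return json.dumps(sum_job(bucket, keys))
```

Listing keys explicitly keeps the job away from a full bucket scan, which is the expensive case.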

  17. Secondary Indexing (2i)
    Tag objects with custom metadata on PUT...
    - riak_object, X-Riak-Index-email_bin: [email protected]
    - riak_object, X-Riak-Index-value_int: “42”
    Exact match and range queries...
    No multi-index queries yet...
    Pagination is on its way...
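
Tagging and querying with 2i can be sketched in Python as header and path builders. The helper names are hypothetical and the email value in the usage note is a placeholder. The sketch assumes index names end in `_bin` or `_int` and that queries use the `/buckets/<bucket>/index/<field>/...` path shape.

```python
from urllib.parse import quote


def index_headers(indexes):
    """Turn {'value_int': 42} into X-Riak-Index-* headers for a PUT."""
    headers = {}
    for field, value in indexes.items():
        if not field.endswith(("_bin", "_int")):
            raise ValueError("index names must end in _bin or _int: " + field)
        headers["X-Riak-Index-" + field] = str(value)
    return headers


def exact_match_path(bucket, field, value):
    """Path for an exact-match query on one index."""
    return "/buckets/%s/index/%s/%s" % (
        quote(bucket, safe=""), field, quote(str(value), safe=""))


def range_path(bucket, field, start, end):
    """Path for a range query on one index."""
    return "/buckets/%s/index/%s/%s/%s" % (
        quote(bucket, safe=""), field,
        quote(str(start), safe=""), quote(str(end), safe=""))
```

For example, `index_headers({"email_bin": "user@example.com", "value_int": 42})` yields the two headers shown on the slide, and `range_path("users", "value_int", 1, 100)` covers the range-query case.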

  18. Riak Search
    Store and index documents (JSON, text, XML, etc)
    Current Riak Search supports subset of Solr API
    Next iteration (Yokozuna; in beta) will implement distributed
    Solr on Riak. It will be sexy.
    Looking for beta testers to help harden Yokozuna

  19. ARCHITECTURE
    The scalability and ease-of-operation goals inform
    architectural decisions. These come with tradeoffs.
    Consistent Hashing
    Virtual Nodes
    Append-only storage
    Handoff/Rebalancing
    Vector Clocks
    Active Anti-Entropy*

  20. Consistent Hashing
    Location of data in the Riak ring is determined based on hash of
    bucket + key.
    Provides even distribution of storage and query load
    Trades off advantages gained from locality
    - e.g. Range queries and aggregates
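
The placement rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Riak's implementation: it hashes the string `bucket/key` with SHA-1 and splits the 160-bit space into equal partitions, whereas Riak hashes its own encoding of the bucket/key pair. The name `partition_for` and the default ring size of 64 are assumptions.

```python
import hashlib

RING_TOP = 2 ** 160  # size of the SHA-1 output space


def partition_for(bucket, key, ring_size=64):
    """Map bucket + key to one of ring_size equal slices of the hash ring."""
    digest = hashlib.sha1((bucket + "/" + key).encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position // (RING_TOP // ring_size)


def spread(bucket, keys, ring_size=64):
    """Count how many of the given keys land on each partition."""
    counts = {}
    for key in keys:
        index = partition_for(bucket, key, ring_size)
        counts[index] = counts.get(index, 0) + 1
    return counts
```

Neighbouring keys such as `user1` and `user2` generally land on unrelated partitions, which is the locality tradeoff the slide mentions: load spreads evenly, but a range scan has to touch the whole ring.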

  21. Consistent Hashing

  22. Virtual Nodes
    Unit of addressing and concurrency in Riak
    Each physical host manages many vnodes
    Partition count / physical machines = vnodes/machine*
    Decouples physical assets from data distribution. This provides:
    - simplicity in cluster sizing
    - failure isolation
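
The vnodes-per-machine arithmetic can be sketched in Python with a plain round-robin assignment. Riak's real claim algorithm is more involved, so treat the round-robin rule as an assumption. The function names and node names are hypothetical.

```python
from collections import Counter


def claim_round_robin(ring_size, nodes):
    """Assign each partition (vnode) to a physical node in turn."""
    return {partition: nodes[partition % len(nodes)]
            for partition in range(ring_size)}


def vnodes_per_node(owners):
    """Count how many vnodes each physical node ended up owning."""
    return Counter(owners.values())
```

With a 64-partition ring, four machines own 16 vnodes each. Adding a fifth machine leaves each with 12 or 13, and the partition count itself never changes, which is what decouples data distribution from the physical hosts.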

  23. Handoff/Rebalancing
    Mechanisms for data rebalancing
    When nodes join/leave the cluster, handoff and rebalancing
    manage the data shuffling dynamically
    Trades off speed of convergence vs. effects on cluster
    performance
    - causes disk & network load

  24. Vector Clocks
    VCs used to rectify object consistency at READ time.
    Lots of knobs to turn; well-documented
    Trades off space, speed, and complexity for safety
    - will store all sibling objects until resolved
    - can lead to object size issues
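
How a vector clock tells a newer version from a conflicting one can be sketched in Python. Clocks here are plain dicts of actor to counter, which is a simplification of what Riak stores, and all function names are hypothetical.

```python
def increment(clock, actor):
    """Return a copy of clock with the actor's counter bumped by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update that clock b has."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify two versions: equal, one supersedes the other, or siblings."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "siblings"


def merge(a, b):
    """Clock for a resolved value: the per-actor maximum of both clocks."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0))
            for actor in set(a) | set(b)}
```

Two clients that both update the same base version produce clocks where neither descends from the other, so both values are kept as siblings until a read resolves them.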

  25. Append-Only Storage
    Riak provides a pluggable backend interface. (Write your own;
    we’ll probably hire you...)
    Bitcask and LevelDB are the most heavily used. Both are
    append-only
    Provides crash safety and speed.
    Trade off: periodic compaction/merge ops
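
The append-only idea and its compaction tradeoff can be sketched as an in-memory Python toy. It keeps a log that is only ever appended to, plus an index of the latest record per key. It is not Bitcask's or LevelDB's actual on-disk format, and the class and method names are hypothetical.

```python
class AppendOnlyStore:
    """Toy log-structured store: writes append, reads go through an index."""

    def __init__(self):
        self.log = []     # (key, value) records, only ever appended
        self.keydir = {}  # key -> position of its latest record in the log

    def put(self, key, value):
        self.log.append((key, value))
        self.keydir[key] = len(self.log) - 1

    def get(self, key):
        position = self.keydir.get(key)
        return None if position is None else self.log[position][1]

    def delete(self, key):
        if key in self.keydir:
            self.log.append((key, None))  # tombstone record
            del self.keydir[key]

    def compact(self):
        """Rewrite the log keeping only the live record for each key."""
        live = sorted(self.keydir.items(), key=lambda item: item[1])
        self.log = [(key, self.log[position][1]) for key, position in live]
        self.keydir = {key: i for i, (key, _) in enumerate(self.log)}
```

Overwrites and deletes leave dead records behind, so the log grows until `compact()` runs. That is the periodic merge cost named on the slide.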

  26. RIAK 1.3
    (AKA “new hotness”)
    Active Anti Entropy
    MapReduce Improvements
    IPv6 Support
    Riaknostic included by default
    Riak Control improvements
    Much more
    Full release notes: https://github.com/basho/riak/blob/1.3/RELEASE-NOTES.md

  27. FUTURE WORK*
    (1.4 and beyond)
    (* all code subject to ship early, late, or not at all)
    Dynamic Ring Size
    Yokozuna
    CRDTs/Data Types
    Riak Object
    Consistency
    2i Improvements
    Riak Pipe work
    Much more

  28. Multi-tenant cloud storage software for public and private clouds.
    Designed to provide simple, available, distributed cloud storage at any scale.
    S3-API compatible and supports per-tenant reporting for billing and metering use cases.
    Additional APIs on the way.
    Stores files of arbitrary size. Under the hood, stores 1MB chunks alongside a manifest.
    Stateless proxy (CS) does chunking. Riak does distribution, storage, etc.
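
The chunk-plus-manifest idea can be sketched in Python. Only the 1MB chunk size comes from the slide. The manifest fields, the chunk key naming, and the function names are hypothetical and are not the real Riak CS format.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1MB chunks, per the slide


def chunk_with_manifest(name, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and describe them in a manifest."""
    chunks = {}
    order = []
    for offset in range(0, len(data), chunk_size):
        chunk_key = "%s:%d" % (name, offset // chunk_size)  # hypothetical naming
        chunks[chunk_key] = data[offset:offset + chunk_size]
        order.append(chunk_key)
    manifest = {
        "name": name,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "chunks": order,
    }
    return manifest, chunks


def reassemble(manifest, chunks):
    """Rebuild the original bytes by reading chunks in manifest order."""
    return b"".join(chunks[chunk_key] for chunk_key in manifest["chunks"])
```

In this picture the proxy does the splitting, and each chunk plus the manifest becomes an ordinary key/value object for the cluster to distribute.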

  29. Extends Riak's capabilities with:
    - multi-datacenter replication
    - SNMP Configuration
    - JMX-Monitoring
    - 24x7 support from Basho Engineers
    One cluster acts as a "source cluster". The source cluster replicates its data
    to one or more "sink clusters" using either real-time or full sync.
    Data transfer is unidirectional (source -> sink). Bidirectional synchronization
    can be achieved by configuring a pair of connections between clusters.

  30. RIAK COMMUNITY
    Mailing List - 1300 developers
    IRC - 200+ people every day yelling about software
    GitHub - 1000s of watchers; 200+ contributors to all projects
    Meetups - 10 Countries, 23 Cities, 3700+ Members & growing fast!
    Deployments - 1000s in production.

  31. May 13-14th in New York City
    ricon.io/east.html
    Talks, hacking, parties
    Dedicated to the future of Riak and
    distributed systems in production
    REGISTER NOW! https://ricon-east-2013.eventbrite.com/?discount=lovevnodes

  32. GETTING STARTED
    Downloads - http://docs.basho.com/riak/latest/downloads/
    Docs - http://docs.basho.com
    Riak Source Code - github.com/basho/riak
    All Basho source Code - github.com/basho/
    Riak Mailing List - http://bit.ly/riak-list
    Email or Tweet me @adron or [email protected]

  33. Let’s Talk UI & CLI - Demo Things

  34. #WHOIS
    Adron Hall | @adron | Coder, Messenger, Recon