DevOps @ Basho

Slides from my talk at the NYC DevOps meetup on Aug 23, discussing Riak, and the nature of the DevOps team at Basho.

Tom Santero

August 23, 2012

Transcript

  1. DevOps @ Basho
    NYC DevOps Meetup - August 23, 2012

  2. $

  3. $ whoami

  4. $ whoami
    Name: Tom Santero
    Title: Technical Evangelist
    Company: Basho Technologies
    Twitter: @tsantero
    $

  5. $ whoami
    Name: Tom Santero
    Title: Technical Evangelist
    Company: Basho Technologies
    Twitter: @tsantero
    $ ./presentation

  6. for the next 45 minutes we’ll be discussing...

  7. (no text on this slide)

  8. just kidding.

  9. Tonight’s Agenda
    • Overview of Riak
    • Nature of DevOps @ Basho
    • What has Basho learned from having a DevOps team

  10. (no text on this slide)

  11. need to know:
    • Written in Erlang/OTP
    • distributed, key/value store + extras
    • advanced query features
    • pre/post commit hooks
    • pluggable backend storage engines
    • open source (Apache v2.0)

  12. (no text on this slide)

  13. key

  14. key value

  15. key value
    bucket

  16. key value
    bucket
    key value
    key value
    key value

  17. bucket/key

  18. [Diagram: five nodes]

  19. (no text on this slide)

  20. request quorums
    replicas
    fallbacks
    consistent hashing

  21. Consistent Hashing

  22. Consistent Hashing
    • 160-bit integer keyspace
    [Ring diagram; labeled points: 0, 2^160/4, 2^160/2]

  23. Consistent Hashing
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    [Ring diagram: 32 partitions; labeled points: 0, 2^160/4, 2^160/2]

  24. Consistent Hashing
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    • partitions are claimed by nodes in the cluster
    [Ring diagram: 32 partitions claimed by node 0, node 1, node 2, node 3; labeled points: 0, 2^160/4, 2^160/2]

  25. Consistent Hashing
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    • partitions are claimed by nodes in the cluster
    • replicas go to the N partitions following the key
    [Ring diagram: partitions claimed by node 0, node 1, node 2, node 3]

  26. Consistent Hashing
    • 160-bit integer keyspace
    • divided into fixed number of evenly-sized partitions
    • partitions are claimed by nodes in the cluster
    • replicas go to the N partitions following the key
    [Ring diagram: partitions claimed by node 0, node 1, node 2, node 3; hash(“meetups/nycdevops”) with N=3]
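
The ring described on the slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Riak's implementation: the round-robin claim of partitions, the `hash_key` and `preference_list` helper names, and hashing the plain string "bucket/key" with SHA-1 are all simplifications introduced here. Only the 160-bit keyspace, the 32 partitions, the four nodes, and N=3 come from the slides.

```python
import hashlib

RING_SIZE = 2 ** 160      # 160-bit integer keyspace
NUM_PARTITIONS = 32       # fixed number of evenly sized partitions
NODES = ["node 0", "node 1", "node 2", "node 3"]
N_VAL = 3                 # N=3 replicas, as on the slide

PARTITION_SIZE = RING_SIZE // NUM_PARTITIONS

# Assumption: partitions are claimed round-robin, so partition i
# belongs to node (i mod 4).
OWNERS = [NODES[i % len(NODES)] for i in range(NUM_PARTITIONS)]


def hash_key(bucket, key):
    """Map a bucket/key pair to an integer position on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def preference_list(bucket, key, n=N_VAL):
    """Return (partition, owner) for the n partitions following the key."""
    first = (hash_key(bucket, key) // PARTITION_SIZE + 1) % NUM_PARTITIONS
    result = []
    for i in range(n):
        partition = (first + i) % NUM_PARTITIONS
        result.append((partition, OWNERS[partition]))
    return result


if __name__ == "__main__":
    for partition, owner in preference_list("meetups", "nycdevops"):
        print(partition, owner)
```

Because three consecutive partitions are claimed by three different nodes under the round-robin assumption, each replica of a key lands on a distinct node.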

  27. Disaster Scenario

  28. Disaster Scenario
    • node fails
    [Ring diagram: eight partitions marked X]

  29. Disaster Scenario
    • node fails
    • requests go to fallback
    [Ring diagram: eight partitions marked X; hash(“meetups/nycdevops”)]

  30. Disaster Scenario
    • node fails
    • requests go to fallback
    • node comes back
    hash(“meetups/nycdevops”)

  31. Disaster Scenario
    • node fails
    • requests go to fallback
    • node comes back
    • “Handoff” - data returns to recovered node
    hash(“meetups/nycdevops”)

  32. Disaster Scenario
    • node fails
    • requests go to fallback
    • node comes back
    • “Handoff” - data returns to recovered node
    • normal operations resume
    hash(“meetups/nycdevops”)
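
The failure sequence above (node fails, requests go to a fallback, node comes back, handoff) can be walked through with a toy model. This is a sketch under assumptions, not Riak's code: the `route` function, the `Cluster` class, the round-robin partition ownership, and the rule "walk the ring to the next live node" are all hypothetical simplifications made for illustration.

```python
NUM_PARTITIONS = 32
NODES = ["node 0", "node 1", "node 2", "node 3"]
# Assumption: partition i is owned by node (i mod 4).
OWNERS = [NODES[i % len(NODES)] for i in range(NUM_PARTITIONS)]


def route(partitions, down):
    """Return (partition, node, is_fallback) for each requested partition."""
    routed = []
    for p in partitions:
        owner = OWNERS[p]
        if owner not in down:
            routed.append((p, owner, False))
            continue
        # Owner is down: walk the ring until a live node is found.
        for step in range(1, NUM_PARTITIONS):
            candidate = OWNERS[(p + step) % NUM_PARTITIONS]
            if candidate not in down:
                routed.append((p, candidate, True))
                break
    return routed


class Cluster:
    def __init__(self):
        self.down = set()
        # data[node][partition] is a dict of key -> value
        self.data = {node: {} for node in NODES}

    def put(self, partitions, key, value):
        """Write the value to whichever node currently serves each partition."""
        for p, node, _ in route(partitions, self.down):
            self.data[node].setdefault(p, {})[key] = value

    def handoff(self, recovered):
        """Mark a node as up and return the data held on its behalf."""
        self.down.discard(recovered)
        for node in NODES:
            if node == recovered:
                continue
            held = [p for p in self.data[node] if OWNERS[p] == recovered]
            for p in held:
                moved = self.data[node].pop(p)
                self.data[recovered].setdefault(p, {}).update(moved)


if __name__ == "__main__":
    cluster = Cluster()
    cluster.down.add("node 1")                      # node fails
    cluster.put([0, 1, 2], "meetups/nycdevops", "slides")
    cluster.handoff("node 1")                       # node comes back
    print(cluster.data["node 1"])
```

In this toy version a fallback may be a node that already holds another replica of the same key; a production system would choose fallbacks more carefully.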

  33. easy to scale

  34. Erlang/OTP

  35. pre/post commit hooks

  36. Bitcask
    LevelDB
    Memory

  37. full-text search
    secondary indexes
    map/reduce

  38. DevOps @ Basho

  39. DevOps Team
    • Brief History:
    • unofficially created in Oct 2011
    • officially recognized in June 2012
    • pilot program (to be discussed)

  40. Internal
    Customers
    Product

  41. Internal

  42. DevOps - Internal
    • mix of programming and utilities
    • automate all the things!
    • instrumental in improving Riak Test Suite
    • 1,000s of separate tests
    • various configs / platforms

  43. guttersnipe

  44. guttersnipe
    • middleware in Vagrant
    • allows us to:
    • deploy + run entire test suite
    • no need to rebuild entire clusters
    • 500% increase in efficiency

  45. Chef Cookbooks

  46. Chef Cookbooks
    • open source:
    • $ git clone [email protected]:basho/riak-chef-cookbook.git
    • over 30 commits in 2012
    • deploy riak
    • future:
    • simplify use
    • rolling upgrades, autoconfig, iptables

  47. Erlang Template Helper
    • also open source:
    • $ git clone [email protected]:basho/erlang_template_helper.git
    • written by Dan Reverri
    • specify Erlang config and args files in JSON

  48. ETH Example
    $ irb
    >> require 'erlang_template_helper'
    => true
    >> args = Eth::Args.new({"-name" => "[email protected]", "-env" => {"ERL_MAX_PORTS" => 4096}})
    => -name [email protected] -env ERL_MAX_PORTS 4096
    >> puts args.pp
    -name [email protected]
    -env ERL_MAX_PORTS 4096
    => nil
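
The idea behind the irb session above, turning a nested mapping into Erlang VM argument lines, can be mimicked in a few lines of Python. This is a hedged illustration of the concept only, not a port of erlang_template_helper: `render_args` is a hypothetical name, and the node name `node@host.example` is a made-up placeholder.

```python
def render_args(args):
    """Render a mapping of Erlang VM flags as one argument per line."""
    lines = []
    for flag, value in args.items():
        if isinstance(value, dict):
            # Nested mappings such as -env expand to one line per entry.
            for name, inner in value.items():
                lines.append(f"{flag} {name} {inner}")
        else:
            lines.append(f"{flag} {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical node name, used only for illustration.
    print(render_args({
        "-name": "node@host.example",
        "-env": {"ERL_MAX_PORTS": 4096},
    }))
```

Keeping the flags in a plain mapping is what makes them easy to generate from JSON attributes in a configuration-management tool.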

  49. ETH Example
    $ .bin/config_to_json multi_backend.config -p
    {
      "riak_kv": {
        "storage_backend": "riak_kv_multi_backend",
        "multi_backend_default": "first_backend",
        "multi_backend": [
          ["__tuple", "first_backend", "riak_kv_bitcask_backend", {
            "data_root": "__string_/var/lib/riak/bitcask"}],
          ["__tuple", "second_backend", "riak_kv_leveldb_backend", {
            "data_root": "__string_/var/lib/riak/leveldb"}]
        ]
      }
    }
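
The JSON above suggests a simple encoding of Erlang terms: objects for property lists, arrays beginning with "__tuple" for tuples, and a "__string_" prefix for Erlang strings, with other strings read as atoms. The Python sketch below converts JSON in that shape back into Erlang term text. The conventions are inferred from this one slide and `to_erlang` is a hypothetical helper, so treat it as an illustration rather than the tool's actual behaviour.

```python
import json


def to_erlang(value):
    """Render a decoded JSON value as Erlang term text (inferred conventions)."""
    if isinstance(value, dict):
        # JSON object -> property list of {key, value} tuples
        pairs = ["{" + key + ", " + to_erlang(val) + "}" for key, val in value.items()]
        return "[" + ", ".join(pairs) + "]"
    if isinstance(value, list):
        if value and value[0] == "__tuple":
            # ["__tuple", a, b, ...] -> {a, b, ...}
            return "{" + ", ".join(to_erlang(v) for v in value[1:]) + "}"
        return "[" + ", ".join(to_erlang(v) for v in value) + "]"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        if value.startswith("__string_"):
            # "__string_..." -> double-quoted Erlang string
            return '"' + value[len("__string_"):] + '"'
        return value  # anything else is treated as a bare atom
    return str(value)  # numbers


if __name__ == "__main__":
    doc = json.loads("""
    {"riak_kv": {
        "storage_backend": "riak_kv_multi_backend",
        "multi_backend_default": "first_backend",
        "multi_backend": [
            ["__tuple", "first_backend", "riak_kv_bitcask_backend",
             {"data_root": "__string_/var/lib/riak/bitcask"}]
        ]
    }}
    """)
    print(to_erlang(doc) + ".")
```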

  50. Investing in Tech

  51. (no text on this slide)

  52. (no text on this slide)

  53. DataCenter Specs
    • 300 cpu cores
    • 500 TB storage
    • ~1 TB RAM
    • gigabit internet connection

  54. Product Engineering

  55. DevOps - Product
    • Basho Engineers all have test boxes
    • Now DevOps delegates DC resources
    • Examples of Product Enhancements:

  56. Example #1
    pushing the limits with socket performance
    4gbps over 8 erl sockets

  57. Example #2
    General Performance Testing and
    performing full-sync DC replication at scale

  58. Testing repl at Scale
    • full-sync repl
    • bottleneck in how keys were being read
    • lexicographical vs disk-ordered
    • used the Boston datacenter to prototype fix
    • currently in review to be merged into Riak EE

  59. Customer Service

  60. DevOps - CliServ
    • replicate customer issues w/ datacenter
    • secondary datacenter:
    • potential customers ship us hardware
    • spec out + provide best implementations

  61. Example #3
    replicating customer issues with replication

  62. Realtime repl Throughput
    • provisioned 5 machines
    • constrained bandwidth and increased latency between clusters
    • but not among nodes in the same cluster (‘management’ network as WAN approx)
    • discovered + eliminated several bottlenecks
    • customer achieved 10x jump in performance

  63. “In both cases I was unable to replicate
    these issues on the hardware available
    to me locally, both through limited
    machines (2) and major performance
    differences between nodes (first gen
    mbp vs i7 dell tower)”
    --Andrew Thompson, Basho Engineer

  64. (no text on this slide)

  65. RiakCS
    • released March 27, 2012
    • shipped RiakCS v1.1 on Tuesday
    • S3-compatible cloud storage
    • built on Riak
    • multi-tenancy, multi billing, etc...

  66. [Diagram: a Large Object; five Riak CS instances, each with a Reporting API and an S3 API; five Riak Nodes]

  67. [Diagram: a Large Object; five Riak CS instances, each with a Reporting API and an S3 API; five Riak Nodes]

  68. [Diagram: a Large Object; five Riak CS instances, each with a Reporting API and an S3 API; five Riak Nodes; four blocks labeled 1mb]

  69. [Diagram: a Large Object; five Riak CS instances, each with a Reporting API and an S3 API; five Riak Nodes; four blocks labeled 1mb]

  70. [Diagram: a Large Object; five Riak CS instances, each with a Reporting API and an S3 API; five Riak Nodes; four blocks labeled 1mb]
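
The diagrams above show a large object arriving at Riak CS and 1mb blocks appearing among the Riak nodes. A minimal sketch of that kind of chunking follows. It assumes 1 MB means 1024 * 1024 bytes and that a simple manifest records block order; the function names are hypothetical and this is not Riak CS code.

```python
BLOCK_SIZE = 1024 * 1024  # assumed block size, from the "1mb" labels


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def build_manifest(blocks):
    """Record the order and size of each block so the object can be rebuilt."""
    return [{"index": i, "size": len(block)} for i, block in enumerate(blocks)]


def reassemble(blocks):
    """Concatenate blocks back into the original object."""
    return b"".join(blocks)


if __name__ == "__main__":
    large_object = bytes(4 * BLOCK_SIZE)   # a 4 MB object of zero bytes
    blocks = split_into_blocks(large_object)
    print(len(blocks), build_manifest(blocks)[0])
```

Each block could then be stored under its own key, which lets the blocks spread across the cluster like any other key/value pair.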

  71. complete test harness for RiakCS

  72. shameless plugs:
    • Basho is Hiring DevOps Engineers:
    • work remotely
    • hack on cool shit
    • help make Basho and Riak better
    • send CV to Sean Carey [email protected]

  73. http://ricon2012.com
    When and where?
    Wednesday, October 10 through Thursday, October 11
    at the W Hotel in downtown San Francisco.

  74. Questions?
