Pat Helland and me: A talk about “Life Beyond Distributed Transactions: An Apostate’s Opinion”

In 2007, Pat Helland published “Life Beyond Distributed Transactions: An Apostate’s Opinion,” in which he conducts a thought experiment on how to design a distributed database that can scale almost infinitely. While the paper explicitly addresses distributed database design, Sean T. Allen shows that the ideas are far more widely applicable, particularly in scaling stateful applications. Sean explores some of Helland’s ideas through practical examples from his experience building data processing systems using tools like Apache Storm and, more recently, developing a stateful distributed stream processor at Wallaroo Labs.

Sean T Allen

July 26, 2018

Transcript

  1. PAT HELLAND AND ME
    A TALK ABOUT “LIFE BEYOND DISTRIBUTED TRANSACTIONS: AN APOSTATE’S OPINION”

  2. SEAN T. ALLEN
    VP OF ENGINEERING AT WALLAROO LABS
    MEMBER OF THE PONY CORE TEAM
    AUTHOR OF “STORM APPLIED”
    LOVER OF FRENCH STREET ART
    @SEANTALLEN
    @WALLAROOLABS
    @PONYLANG

  3. PAT HELLAND
    AND ME

  4. PAT HELLAND
    WRITER OF PAPERS I LOVE

  5. PAT HELLAND
    LIFE BEYOND DISTRIBUTED
    TRANSACTIONS

  6. WHAT’S IN THIS TALK…

  7. SOME AXIOMS…

  8. TO SCALE INFINITELY,
    WE HAVE TO SCALE HORIZONTALLY

  9. TO SCALE INFINITELY,
    WE MUST AVOID COORDINATION

  10. DISTRIBUTED TRANSACTIONS ARE
    A FORM OF COORDINATION

  11. THEREFORE…
    TO SCALE INFINITELY,
    WE CAN’T USE TRANSACTIONS

  12. WHAT IS SCALING?

  13. MORE AND
    MORE THINGS
    BUT, THEY DON’T GET
    BIGGER. THERE’S JUST…
    MORE OF THEM. LOTS MORE.

  14. WE SCALE
    ENTITIES
    ENTITIES:
    LIVE ON A SINGLE MACHINE
    AND ARE MANIPULATED
    INDIVIDUALLY

  15. WHAT IS AN ENTITY?

  16. ENTITIES ARE BOUNDARIES OF
    ATOMICITY

  17-21. [Diagram, shown across five slides: items 3, 5, 6, 8 grouped under the entity Bob; items 1, 2, 4, 7 grouped under the entity Alice]

  22. DENORMALIZE…
    ALL THE THINGS!

  23. TWO-LAYER ARCHITECTURE

  24. scale-agnostic
    scale-aware
    API

  25. scale-agnostic
    scale-aware
    API

  26. scale-agnostic
    scale-aware
    API

  27. scale-agnostic
    scale-aware
    API

  28. scale-agnostic
    scale-aware
    API

  29. scale-agnostic
    scale-aware
    API

  30. TO SCALE INFINITELY,
    YOUR BUSINESS LOGIC HAS TO BE
    INDEPENDENT OF SCALE
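
One way to picture the two layers is the minimal sketch below. It is not code from the talk or from Helland's paper: `add_item`, `ScaleAwareLayer`, and the byte-sum placement rule are all hypothetical names and choices made for illustration. The scale-agnostic function touches exactly one entity, while the scale-aware layer alone decides where that entity lives.

```python
from typing import Callable, Dict, List


# Scale-agnostic layer: business logic that sees exactly one entity and
# knows nothing about machines or partitions.
def add_item(cart: List[str], item: str) -> None:
    cart.append(item)


class ScaleAwareLayer:
    """Hypothetical lower layer that owns entity placement and routing."""

    def __init__(self, partition_count: int) -> None:
        # Each dict stands in for the local entity store of one machine.
        self.partitions: List[Dict[str, List[str]]] = [
            {} for _ in range(partition_count)
        ]

    def _partition_for(self, key: str) -> Dict[str, List[str]]:
        # Deterministic placement; a byte sum keeps the sketch reproducible.
        return self.partitions[sum(key.encode("utf-8")) % len(self.partitions)]

    def apply(self, key: str, logic: Callable[[List[str], str], None], arg: str) -> None:
        # Find (or create) the entity, then run the logic against it alone.
        entity = self._partition_for(key).setdefault(key, [])
        logic(entity, arg)

    def get(self, key: str) -> List[str]:
        return self._partition_for(key).get(key, [])
```

Under these assumptions, `add_item` behaves identically whether the layer is built with 1 partition or 8, which is the property the slide asks for.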

  31. TWO BIG IDEAS
    A WORLD OF POSSIBILITIES

  32. WALLAROO
    SCALE INDEPENDENT
    COMPUTING FOR PYTHON

  33. ENTITIES
    BUT WE CALL THEM…
    “STATE OBJECTS”

  34. TWO-LAYER
    ARCHITECTURE
    BUT WE CALL IT…
    “SCALE INDEPENDENCE”

  35. user supplied logic
    Wallaroo runtime
    Wallaroo API

  36. user supplied logic
    Wallaroo runtime
    Wallaroo API

  37. user supplied logic
    Wallaroo runtime
    Wallaroo API

  38. user supplied logic
    Wallaroo runtime
    Wallaroo API

  39. WALLAROO API
    MARKET SPREAD EXAMPLE

  40. MARKET SPREAD
    REAL-TIME “SOMETHING AIN’T RIGHT” TRADE CHECKS
    [Diagram labels: Market Data, Orders, Update, Check, Market Spread State (APPL, MSFT), Rejections]

  41. MARKET SPREAD
    TWO SOURCES OF DATA
    [Diagram labels: Market Data, Orders, Update, Check, Market Spread State (APPL, MSFT), Rejections]

  42. MARKET SPREAD
    ONE SINK
    [Diagram labels: Market Data, Orders, Update, Check, Market Spread State (APPL, MSFT), Rejections]

  43. MARKET SPREAD
    ORDER PIPELINE
    [Diagram labels: Market Data, Orders, Update, Check, Market Spread State (APPL, MSFT), Rejections]

  44. MARKET SPREAD
    MARKET DATA PIPELINE
    [Diagram labels: Market Data, Orders, Update, Check, Market Spread State (APPL, MSFT), Rejections]

  45. MARKET SPREAD
    APPLICATION DEFINITION

  46. APPLICATION DEFINITION
    FLOW OF DATA FROM SOURCE TO SINK

  47. TWO DATA PIPELINES
    ORDERS

  48. TWO DATA PIPELINES
    MARKET DATA

  49. DEFINE OUR SOURCES
    1 PER PIPELINE

  50. DEFINE OUR OPERATIONS
    1 PER PIPELINE

  51. DEFINE OUR OPERATIONS
    CHECK ORDER AGAINST SYMBOL DATA

  52. DEFINE OUR OPERATIONS
    UPDATE SYMBOL DATA WITH LATEST MARKET DATA

  53. DEFINE OUR SINKS
    1 PER PIPELINE

  54. DEFINE OUR SINKS
    ORDERS PIPELINE MIGHT HAVE OUTPUT

  55. DEFINE OUR SINKS
    MARKET DATA ONLY UPDATES SYMBOL DATA: NO OUTPUT

  56. SCALE INDEPENDENT
    ONLY FLOW OF DATA AND OPERATIONS
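
The application definition walked through above can be pictured as plain data describing sources, operations, and sinks. The sketch below deliberately does not use the Wallaroo API, which this transcript does not show; `Pipeline`, `Application`, and their fields are hypothetical stand-ins. What it illustrates is that nothing in the definition mentions workers, partitions, or cluster size.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Pipeline:
    name: str
    source: str          # where messages enter the pipeline
    operation: str       # the one state computation applied per message
    sink: Optional[str]  # None when the pipeline produces no output


@dataclass
class Application:
    name: str
    pipelines: List[Pipeline] = field(default_factory=list)

    def add(self, pipeline: Pipeline) -> "Application":
        self.pipelines.append(pipeline)
        return self  # allow chained definitions


# Two pipelines, as in the Market Spread example: orders may emit
# rejections, market data only updates symbol state.
market_spread = (
    Application("Market Spread")
    .add(Pipeline("Orders", source="orders",
                  operation="check order against symbol data",
                  sink="rejections"))
    .add(Pipeline("Market Data", source="market data",
                  operation="update symbol data with latest market data",
                  sink=None))
)
```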

  57. USER SUPPLIED
    LOGIC

  58. UPDATE MARKET DATA STATE COMPUTATION
    UPDATES SYMBOL DATA STATE
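
As a rough sketch of what the user-supplied logic for the two pipelines might look like: the field names, the function signatures, and the 5% spread threshold below are assumptions chosen for illustration, not taken from the slides or from Wallaroo itself. Each function receives the state object for a single symbol and touches nothing else.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RELATIVE_SPREAD = 0.05  # assumed threshold, not taken from the talk


@dataclass
class SymbolData:
    """State for one symbol; field names are illustrative assumptions."""
    last_bid: float = 0.0
    last_offer: float = 0.0
    should_reject_trades: bool = True  # reject until market data arrives


def update_market_data(bid: float, offer: float, state: SymbolData) -> None:
    """Market data pipeline: update the symbol's state and emit nothing."""
    mid = (bid + offer) / 2.0
    state.last_bid = bid
    state.last_offer = offer
    # Flag the symbol when the spread is wide relative to the mid price.
    state.should_reject_trades = (
        mid <= 0 or (offer - bid) / mid >= MAX_RELATIVE_SPREAD
    )


def check_order(order_id: str, state: SymbolData) -> Optional[str]:
    """Orders pipeline: return a rejection message, or None for no output."""
    if state.should_reject_trades:
        return f"rejected {order_id}: bid={state.last_bid} offer={state.last_offer}"
    return None
```

Because both functions take the state object as an argument, neither needs to know which worker holds that symbol.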

  59. WALLAROO
    RUNTIME
    MESH NETWORK OF
    COOPERATING PROCESSES

  60. STATE OBJECTS
    ONE BIG MAP?

  61. STATE OBJECTS
    CONCEPTUALLY IT’S LIKE A BIG MAP
    [Diagram labels: Market Data, Update, State]

  62. STATE OBJECTS
    WITH A KEY FOR EACH OBJECT
    [Diagram labels: Market Data, Update; state object keys: APPL, IBM, MSFT, AMZN, INTC, NVDA]

  63. STATE OBJECTS
    WHERE WE MAP FROM INCOMING DATA’S KEY
    [Diagram labels: Market Data (MSFT); state object keys: APPL, IBM, MSFT, AMZN, INTC, NVDA]

  64. STATE OBJECTS
    TO THE STATE OBJECT FOR THAT KEY
    [Diagram labels: Market Data (MSFT); state object keys: APPL, IBM, MSFT, AMZN, INTC, NVDA]

  65. HASH
    PARTITIONING
    DISTRIBUTING STATE
    OBJECTS ACROSS A
    CLUSTER

  66. SINGLE WORKER
    ALL SYMBOLS TOGETHER
    APPL
    AMZN
    MSFT
    IBM

  67. SINGLE WORKER
    ALL SYMBOLS TOGETHER
    APPL
    AMZN
    MSFT
    IBM

  68. ADD ANOTHER WORKER
    STATE OBJECTS WILL BE REDISTRIBUTED ACROSS THE CLUSTER
    APPL
    AMZN
    MSFT
    IBM

  69. ADD ANOTHER WORKER
    STATE OBJECTS WILL BE REDISTRIBUTED ACROSS THE CLUSTER
    APPL
    AMZN
    MSFT
    IBM

  70. ADD ANOTHER WORKER
    STATE OBJECTS WILL BE REDISTRIBUTED ACROSS THE CLUSTER
    APPL
    AMZN
    MSFT
    IBM

  71. ADD ANOTHER WORKER
    STATE OBJECTS WILL BE REDISTRIBUTED ACROSS THE CLUSTER
    APPL
    AMZN
    MSFT
    IBM

  72. ADD ANOTHER WORKER
    STATE OBJECTS WILL BE REDISTRIBUTED ACROSS THE CLUSTER
    APPL
    AMZN
    MSFT
    IBM

  73. ADD ANOTHER WORKER
    STATE OBJECTS WILL BE REDISTRIBUTED ACROSS THE CLUSTER
    APPL
    AMZN
    IBM
    MSFT

  74. ADD ANOTHER WORKER
    STATE OBJECTS WILL BE REDISTRIBUTED ACROSS THE CLUSTER
    APPL
    AMZN
    IBM
    MSFT
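
The redistribution shown in the slides above can be sketched with a simple hash-mod-N placement. This is an assumed stand-in: the transcript does not say which partitioning scheme Wallaroo actually uses, and `worker_for`, `place`, and the worker names are hypothetical. SHA-256 is used instead of Python's built-in `hash()` so that placement stays stable across processes.

```python
import hashlib
from typing import Dict, List


def worker_for(key: str, workers: List[str]) -> str:
    """Pick a worker from a stable hash of the key (hash-mod-N placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return workers[int.from_bytes(digest[:8], "big") % len(workers)]


def place(keys: List[str], workers: List[str]) -> Dict[str, List[str]]:
    """Group keys by the worker that owns them."""
    placement: Dict[str, List[str]] = {w: [] for w in workers}
    for key in keys:
        placement[worker_for(key, workers)].append(key)
    return placement


symbols = ["APPL", "AMZN", "MSFT", "IBM"]

# Single worker: all symbols live together.
one_worker = place(symbols, ["worker-1"])

# Add another worker: some state objects may move to it.
two_workers = place(symbols, ["worker-1", "worker-2"])
moved = [
    s for s in symbols
    if worker_for(s, ["worker-1"]) != worker_for(s, ["worker-1", "worker-2"])
]
```

Which symbols end up in `moved` depends on their hashes. Hash-mod-N can relocate many keys when the worker count changes; a scheme such as consistent hashing would move fewer, at the cost of a more involved lookup.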

  75. STATE OBJECTS

  76. A WALLAROO STATE OBJECT
    PLAIN OLD PYTHON
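
To make "plain old Python" concrete, here is a minimal sketch that also mirrors the earlier "big map" picture of one state object per key. The class, its fields, and the `route` helper are hypothetical; the code shown on the original slide is not part of this transcript.

```python
from typing import Dict


class SymbolState:
    """A state object is just a Python object; these fields are assumptions."""

    def __init__(self) -> None:
        self.last_price = 0.0
        self.updates_seen = 0

    def apply_update(self, price: float) -> None:
        self.last_price = price
        self.updates_seen += 1


# Conceptually one big map with a key for each state object.
state_objects: Dict[str, SymbolState] = {}


def route(symbol: str, price: float) -> SymbolState:
    """Map from the incoming data's key to the state object for that key."""
    if symbol not in state_objects:
        state_objects[symbol] = SymbolState()  # created on first use
    state = state_objects[symbol]
    state.apply_update(price)
    return state
```

In a real cluster the single dictionary would be split across workers, but the per-key lookup that user code relies on stays the same.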

  77. LEARN MORE
    GITHUB.COM/SEANTALLEN/PAT-HELLAND-AND-ME
