Voxxed Keynote 2016: Out of the Fire Swamp

Adrian Colyer
February 25, 2016

This talk is primarily based around my 'Out of the Fire Swamp' and 'All Change Please' blog posts: http://blog.acolyer.org/2015/09/08/out-of-the-fire-swamp-part-i-the-data-crisis/, http://blog.acolyer.org/2016/01/22/all-change-please/

Transcript

  1. Out of the Fire Swamp
    Adrian Colyer
    @adriancolyer

  2. blog.acolyer.org
    350
    Foundations
    Frontiers

  3. Out of the Fire Swamp
    The Data Crisis and What we Can Do About it
    01 Questioning your Integrity
    02 The Art of the Possible
    03 All Change Please

  4. The Gold Standard
    Serializable
    t1 t2 t3 t4

  5. Anomalies due to weaker isolation
    Dirty Writes (P0)
    Dirty Reads (P1)
    Cursor Lost Updates (P4C)
    Lost Updates (P4)
    Fuzzy (Non-repeatable) Reads (P2)
    Phantoms (P3)
    Read Skew (A5A)
    Write Skew (A5B)

  6. Write Skew Example
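
A minimal sketch of write skew (A5B), assuming a toy snapshot-isolation store with first-committer-wins conflict detection. The `Store` and `Txn` classes and the doctors-on-call scenario are hypothetical illustrations, not taken from the slide. Each transaction sees two doctors on call in its own snapshot and takes a different one off call; the write sets are disjoint, so both commit and the invariant "at least one doctor on call" is broken.

```python
class Store:
    """Toy versioned key-value store (hypothetical, for illustration only)."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}

    def begin(self):
        return Txn(self)


class Txn:
    """Transaction that reads from a private snapshot taken at begin()."""

    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.data)
        self.seen_version = dict(store.version)
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Snapshot isolation only checks write-write conflicts:
        # abort if a key we wrote has changed since our snapshot.
        for key in self.writes:
            if self.store.version[key] != self.seen_version[key]:
                return False
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] += 1
        return True


def go_off_call(txn, doctor):
    """Application rule: only leave if another doctor stays on call."""
    on_call = [d for d in ("alice", "bob") if txn.read(d)]
    if len(on_call) >= 2:
        txn.write(doctor, False)


store = Store({"alice": True, "bob": True})
t1, t2 = store.begin(), store.begin()   # both snapshots see two doctors on call
go_off_call(t1, "alice")
go_off_call(t2, "bob")
committed = (t1.commit(), t2.commit())
on_call_after = sum(store.data.values())
print(committed, on_call_after)         # (True, True) 0
```

Running the same two requests one after the other would leave one doctor on call, which is why no serial order can explain the result above.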

  7. Theory & Practice

  8. Feral Concurrency Control Results

  9. The Gold Standard
    Linearizable

  10. Anomalies due to weaker consistency
    Non-monotonic reads (L1)
    Non-monotonic writes (L2)
    Non-monotonic transactions (L3)
    Not reading your writes (L4)
    Monotonic Reads
    Monotonic Writes
    Writes Follow Reads
    Read Your Writes
    Monotonic Reads + Monotonic Writes + Writes Follow Reads = PRAM
    PRAM + Read Your Writes = Causal Consistency

  11. Non-Monotonic Read Example
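
A small sketch of a non-monotonic read, assuming two replicas of a single value where one lags behind. The `Replica` and `Session` classes are hypothetical and not from the slide. A client that reads from whichever replica answers can see "v2" and then "v1"; a session that remembers the highest version it has seen can refuse the stale answer.

```python
class Replica:
    """Holds one versioned value; updates arrive asynchronously."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        if version > self.version:
            self.version, self.value = version, value

    def read(self):
        return self.version, self.value


class Session:
    """Client session that enforces monotonic reads with a version high-water mark."""

    def __init__(self):
        self.last_seen = 0

    def read(self, replicas):
        # Try replicas in the given order and skip any that are behind
        # what this session has already observed.
        for replica in replicas:
            version, value = replica.read()
            if version >= self.last_seen:
                self.last_seen = version
                return value
        raise RuntimeError("no reachable replica is fresh enough")


fresh, stale = Replica(), Replica()
fresh.apply(1, "v1")
stale.apply(1, "v1")
fresh.apply(2, "v2")                    # the update has not reached `stale` yet

# Naive client: time appears to run backwards.
naive_reads = [fresh.read()[1], stale.read()[1]]

# Session client: the stale replica is skipped on the second read.
session = Session()
first = session.read([fresh, stale])
second = session.read([stale, fresh])
print(naive_reads, [first, second])     # ['v2', 'v1'] ['v2', 'v2']
```

The trade-off in this sketch is availability: if only the stale replica were reachable, the session read would fail rather than go backwards.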

  12. Expectations and Reality
    The ALPS

  13. Probabilistically Bounded Staleness
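
A rough Monte Carlo sketch of the question behind probabilistically bounded staleness: how likely is a read, issued `t` time units after a write commits, to see that write? The model is a deliberate simplification and every parameter is an assumption. A write is sent to all `n` replicas with exponentially distributed delays, it commits once `w` replicas have it, and the read contacts `r` random replicas with no read latency.

```python
import random


def p_consistent(n, w, r, t, trials=20000, mean_delay=1.0, seed=0):
    """Estimate P(read sees the latest write) t time units after commit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Time at which each replica receives the write.
        arrival = [rng.expovariate(1.0 / mean_delay) for _ in range(n)]
        # The write commits when the w-th replica has acknowledged it.
        commit_time = sorted(arrival)[w - 1]
        read_time = commit_time + t
        # The read asks r distinct replicas and succeeds if any has the write.
        contacted = rng.sample(range(n), r)
        if any(arrival[i] <= read_time for i in contacted):
            hits += 1
    return hits / trials


strict_quorum = p_consistent(n=3, w=2, r=2, t=0.0)       # r + w > n
partial_at_commit = p_consistent(n=3, w=1, r=1, t=0.0)
partial_later = p_consistent(n=3, w=1, r=1, t=10.0)
print(strict_quorum, partial_at_commit, partial_later)
```

With overlapping quorums (`r + w > n`) every read in this model sees the write. With `w = r = 1` the chance starts at roughly one in three and rises towards one as `t` grows.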

  14. Multi-entity / Multi-partition Transactions?
    Frequently Not Supported
    t1 t2 t3 t4

  16. Eventual Consistency at Google
    “We [also] have a lot of experience with eventual
    consistency systems at Google. In all such
    systems, we find developers spend a significant
    fraction of their time building extremely complex
    and error-prone mechanisms to cope with
    eventual consistency and handle data that may be
    out of date. We think this is an unacceptable
    burden to place on developers and that
    consistency problems should be solved at the
    database level.”
    - F1: A Distributed SQL Database That Scales (2012)

  17. Out of the Fire Swamp
    The Data Crisis and What we Can Do About it
    01 Questioning your Integrity
    02 The Art of the Possible
    03 All Change Please

  18. Causal Consistency
    “No consistency stronger than real-time
    causal consistency (RTC) can be provided in
    an always available, one-way convergent
    system, and RTC can be provided in an always
    available one-way convergent system.”

  19. What can’t we protect against assuming HA?
    Cursor Lost Updates (P4C)
    Lost Updates (P4)
    Write Skew (A5B)
    Stale Reads
    Not reading your writes (L4): can provide Read Your Writes with sticky sessions
    Recency Guarantees

  20. So What Can We Do?
    Memories, Guesses & Apologies

  21. Coordination Avoidance and I-Confluence Analysis
    TPC-C
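
An illustrative sketch of the invariant-confluence idea: operations can run without coordination if merging any two independently valid branches always yields a valid state. The account example, the `merge` rule (apply both branches' deltas to the common start) and the brute-force checker are assumptions made for illustration. It tests only single operations from one starting state, so it is a bounded check, not a proof.

```python
from itertools import product


def invariant(balance):
    """Application invariant: the balance never goes negative."""
    return balance >= 0


def merge(start, left, right):
    """Combine two divergent replicas by applying both deltas to the start state."""
    return start + (left - start) + (right - start)


def is_i_confluent(start, operations):
    """Check every pair of locally valid branches for an invalid merge."""
    for op_a, op_b in product(operations, repeat=2):
        left, right = start + op_a, start + op_b
        both_valid = invariant(left) and invariant(right)
        if both_valid and not invariant(merge(start, left, right)):
            return False
    return True


deposits_ok = is_i_confluent(100, [+10, +50])      # deposits never break the invariant
withdrawals_ok = is_i_confluent(100, [-60, -70])   # each is fine alone, together they overdraw
print(deposits_ok, withdrawals_ok)                 # True False
```

In this toy model deposits can proceed on any replica without coordination, while withdrawals need coordination, or a detect-and-apologise path, to keep the balance non-negative.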

  23. 01 Avoid coordination when you can
    02 Use Causal+ Consistency when you can’t
    03 Detect, and apologise for, what you can’t prevent
    + Dimmer Switches

  24. Multi-Partition Transactions at Scale

  25. Computing at the Edge

  26. Out of the Fire Swamp
    The Data Crisis and What we Can Do About it
    01 Questioning your Integrity
    02 The Art of the Possible
    03 All Change Please

  27. Human computers at Dryden by NACA (NASA) - Dryden Flight Research Center Photo Collection
    http://www.dfrc.nasa.gov/Gallery/Photo/Places/HTML/E49-54.html.
    Licensed under Public Domain via Commons -
    https://commons.wikimedia.org/wiki/File:Human_computers_-_Dryden.jpg#/media/File:Human_computers_-_Dryden.jpg

  28. Computing on a Human Scale
    Registers & L1-L3: 10ns -> 10s (File on desk)
    Main memory: 70ns -> 1:10s (Office filing cabinet)
    HDD: 10ms -> 116d (Trip to the warehouse)
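
The human-scale figures on this slide are consistent with a simple rescaling in which 1 ns of machine time is read as 1 s of human time. A small sketch of that arithmetic follows; the `human_scale` helper is hypothetical and the 1 ns to 1 s factor is inferred from the slide's numbers, not stated on it.

```python
def human_scale(nanoseconds):
    """Format a latency as if every nanosecond took one second."""
    seconds = int(round(nanoseconds))
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    parts = [(days, "d"), (hours, "h"), (minutes, "m"), (secs, "s")]
    return " ".join(f"{value}{unit}" for value, unit in parts if value) or "0s"


latencies_ns = {
    "Registers & L1-L3": 10,
    "Main memory": 70,
    "HDD": 10_000_000,          # 10 ms
}
scaled = {name: human_scale(ns) for name, ns in latencies_ns.items()}
for name, text in scaled.items():
    print(f"{name}: {text}")
# Registers & L1-L3: 10s
# Main memory: 1m 10s
# HDD: 115d 17h 46m 40s
```

The last value rounds to the 116 days shown on the slide.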

  29. All Change Please
    Next Generation Hardware
    Compute: HTM, Persistent Memory NI, FPGA, GPUs
    Memory: NVDIMMs, Persistent Memory
    Networking: 100GbE, RDMA
    Storage: NVMe, Next-gen NVM

  31. Computing on a Human Scale
    File on desk: 10s
    Office filing cabinet: 1:10s
    4x capacity fireproof local filing cabinets: 2-10m
    Phone another office (RDMA): 23-40m
    Next-gen warehouse: 3h20m
    Trip to the warehouse: 116d

  32. The New ~Numbers Everyone Should Know

                                  Latency  Bandwidth  Capacity/IOPS
    Register                      0.25ns
    L1 cache                      1ns
    L2 cache                      3ns                 8MB
    L3 cache                      11ns                45MB
    DRAM                          62ns     120GB/s    6TB - 4 socket
    NVRAM’ DIMM                   620ns    60GB/s     24TB - 4 socket
    1-sided RDMA in Data Center   1.4us    100GbE     ~700K IOPS
    RPC in Data Center            2.4us    100GbE     ~400K IOPS
    NVRAM’ NVMe                   12us     6GB/s      16TB/disk, ~2M/600K
    NVRAM’ NVMf                   90us     5GB/s      16TB/disk, ~700/600K

  33. Low Latency - RAMCloud
    Reads: 5μs
    Writes: 13.5μs
    Transactions: 20μs
    5-object Txns: 27μs
    TPC-C: 33tps

  34. No Compromises - FaRM
    TPC-C (90 nodes): 4.5M tps, 99%ile 1.9ms
    KV (per node): 6.3M qps, 41μs at peak throughput

  35. No Compromises
    “This paper demonstrates that new software in modern
    data centers can eliminate the need to compromise. It
    describes the transaction, replication, and recovery
    protocols in FaRM, a main memory distributed computing
    platform. FaRM provides distributed ACID transactions
    with strict serializability, high availability, high
    throughput and low latency. These protocols were
    designed from first principles to leverage two hardware
    trends appearing in data centers: fast commodity
    networks with RDMA and an inexpensive approach to
    providing non-volatile DRAM.”

  36. DrTM
    The Doctor will see you now
    5.5M tps on TPC-C on a 6-node cluster.

  37. Some things Change, Some stay the Same

  38. A Brave New World
    Fast RDMA networks +
    Ample Persistent Memory +
    Hardware Transactions +
    Enhanced HW Cache Management +
    Super-fast Storage +
    On-board FPGAs + GPUs + … = ???

  39. Out of the Fire Swamp
    The Data Crisis and What we Can Do About it
    01 Questioning your Integrity
    02 The Art of the Possible
    03 All Change Please

  40. 01 A new paper every weekday
    Published at http://blog.acolyer.org.
    02 Delivered Straight to your inbox
    If you prefer email-based subscription to read at your leisure.
    03 Announced on Twitter
    I’m @adriancolyer.
    04 Go to a Papers We Love Meetup
    A repository of academic computer science papers and a community who loves reading them.
    05 Share what you learn
    Anyone can take part in the great conversation.

  41. THANK YOU!
    @adriancolyer