2018: "What Ever Happened To Durability?"

Tom Lyon
August 28, 2018

Presentation for the 2018 Mountpoint conference in Vancouver. And a love song to Apache Pulsar and Bookkeeper.

Transcript

  1. Durability Defined
    §  If written data is acknowledged, it must be forever readable
    §  If written data is read once [before it is acknowledged], it must be forever readable
  2. Nothing is Forever
    §  Hardware eventually fails
    §  Software eventually (?) works
    §  Durability is a matter of degree
    §  What is good enough?
  3. Performance is the Enemy
    §  “The only good write is an O_SYNC write”
    §  Write-behind, caching, background compaction/migration can all lead to hidden errors
    §  fsync(2) can and should return errors, but misses some
    §  See https://wiki.postgresql.org/wiki/Fsync_Errors
    §  PostgreSQL: Caring about durability since 1986
    §  “commit intervals”?
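
The fsync(2) point on slide 3 can be illustrated with a minimal sketch in Python, assuming a POSIX system. The helper name `durable_write` is hypothetical, and the write-to-temp-file, fsync, rename, fsync-the-directory sequence is one common pattern rather than anything the slides prescribe. The key behavior is that any error from `os.fsync` propagates to the caller instead of being retried or ignored.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data`, returning only after fsync has succeeded.

    Hypothetical helper for illustration. Raises OSError if a write,
    fsync, or rename step reports a failure.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # os.write may write fewer bytes than asked
            view = view[written:]
        # Raises OSError on failure; treat that as data loss, not as "try again".
        os.fsync(fd)
    except BaseException:
        os.close(fd)
        os.unlink(tmp_path)
        raise
    os.close(fd)
    os.replace(tmp_path, path)  # atomic rename on POSIX file systems
    # Flush the directory so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even this only narrows the window: as the slide notes, fsync can miss some errors, so a successful return is a strong hint rather than proof.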
  4. Can’t trust a File System
    “We analyze 11 applications, and find 60 vulnerabilities, some of which result in severe consequences like corruption or data loss.”
  5. Can’t trust an SSD
    ‘Surprisingly, we find that 13 out of the 15 devices, including the supposedly “enterprise-class” devices, exhibit failure behavior contrary to our expectations’
  6. Servers and Mayflies
    §  Back in the day, when “the” computer crashed, you just waited for repair
    §  Now you remove or re-image the server – with the drives
    §  Local durability is really hard, but no longer adequate
  7. Replication
    §  Backups? Not timely
    §  Synchronous mirroring? Very expensive
    §  Just use the network! Make copies! Go forth and replicate!
    §  Losing a disk or server no longer causes lost data. Right? Who needs fsync?
  8. Correlated Failures
    §  AWS can lose a data center, you can too
    §  Rack power problems are common
    §  The smaller your cluster, the more vulnerable it is
    https://xkcd.com/1737
  9. CAP Theorem
    §  You will have Partitioning.
    §  You must choose between Availability and Consistency.
    §  Your users will hate your choice.
    §  Availability can be improved by brute force and $$$ - to reduce partitioning.
    §  Consistency requires consensus.
  10. Jepsen breaks everything
    “Use Zookeeper. It’s mature, well-designed, and battle-tested.”
    “The etcd and Consul teams both take consistency seriously…”
    Kyle Kingsbury, https://jepsen.io
  11. Logs & Journals
    §  Application first writes to log, then to where the data “really lives”
    §  FS writes to journal, then to where the data “really lives”
    §  Device writes to log, then to where the data “really lives”
    §  What if “the truth” “really lived” in the log?
    §  The other places become read caches
  12. Table and Stream Duality
    §  “A table is just a cache of the latest value for each key in a stream” – P. Helland
    §  Logs are great for streaming data
    §  What if the log itself is distributed and allows many writers and readers?
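
The idea on slides 11 and 12, that the log holds the truth and a table is only a cache of the latest value per key, can be sketched in a few lines of Python. This is an in-memory toy under stated assumptions: the names `Log` and `materialize` are hypothetical, a value of `None` is used as a delete marker, and nothing here is durable or distributed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Log:
    """Append-only list of (key, value) records; the source of truth in this sketch."""
    entries: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def append(self, key: str, value: Optional[str]) -> int:
        """Append a record and return its offset. A value of None marks a delete."""
        self.entries.append((key, value))
        return len(self.entries) - 1


def materialize(log: Log, upto: Optional[int] = None) -> Dict[str, str]:
    """Rebuild the table by replaying the log from offset 0 up to `upto` (exclusive)."""
    table: Dict[str, str] = {}
    for key, value in log.entries[:upto]:
        if value is None:
            table.pop(key, None)  # delete marker removes the key from the cache
        else:
            table[key] = value    # a later record overrides an earlier one
    return table


log = Log()
log.append("a", "1")
log.append("b", "2")
log.append("a", "3")
log.append("b", None)
print(materialize(log))     # {'a': '3'}
print(materialize(log, 2))  # {'a': '1', 'b': '2'}
```

Because the table is derived, it can be discarded and rebuilt at any time, or rebuilt as of an earlier offset, which is what makes it a cache rather than the place the data "really lives".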
  13. Streaming Systems
    §  Apache Kafka
      §  60 second “commit interval?”
    §  Apache Pulsar
      §  Uses Apache Bookkeeper
    §  Distributed Logs:
      §  Apache DistributedLog – uses Bookkeeper
      §  Facebook LogDevice
  14. Apache Bookkeeper™
    §  “A scalable, fault-tolerant, and low-latency storage service optimized for real-time workloads”
    §  Guarantees:
      §  “If an entry has been acknowledged, it must be readable”
      §  “If an entry has been read once, it must always be readable”
  15. Bookkeeper Components
    §  Client-side library
    §  Distributed Ledger Abstraction
    §  “Bookie” – very simple storage nodes
    §  Bookies do NOT talk to each other
    §  Zookeeper coordination, consensus, cluster membership, and quorums
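
The split on slide 15, with simple storage nodes that never talk to each other and a client-side library that does the replication, can be modeled with a toy quorum write in Python. Everything here is hypothetical: `Bookie`, `LedgerWriter`, and `ack_quorum` are illustrative names for this sketch, not the Bookkeeper client API, and the nodes are plain in-memory objects. The sketch only shows an entry being acknowledged once enough nodes have stored it.

```python
from typing import Dict, List


class Bookie:
    """Hypothetical in-memory storage node; it never contacts other bookies."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.entries: Dict[int, bytes] = {}

    def add_entry(self, entry_id: int, data: bytes) -> bool:
        """Store the entry and return True, or return False if this node is down."""
        if not self.healthy:
            return False
        self.entries[entry_id] = data
        return True


class LedgerWriter:
    """Hypothetical client-side writer that sends each entry to every bookie."""

    def __init__(self, bookies: List[Bookie], ack_quorum: int) -> None:
        if not 1 <= ack_quorum <= len(bookies):
            raise ValueError("ack_quorum must be between 1 and the number of bookies")
        self.bookies = bookies
        self.ack_quorum = ack_quorum
        self.next_entry_id = 0

    def add_entry(self, data: bytes) -> int:
        """Acknowledge the write only if at least ack_quorum bookies stored it."""
        entry_id = self.next_entry_id
        acks = sum(1 for bookie in self.bookies if bookie.add_entry(entry_id, data))
        if acks < self.ack_quorum:
            # A real system would repair or replace nodes; this sketch just fails.
            raise OSError(f"entry {entry_id}: got {acks} acks, need {self.ack_quorum}")
        self.next_entry_id += 1
        return entry_id


bookies = [Bookie("b1"), Bookie("b2"), Bookie("b3", healthy=False)]
writer = LedgerWriter(bookies, ack_quorum=2)
print(writer.add_entry(b"hello"))  # 0
```

With one of three nodes down the write is still acknowledged; if a second node fails, `add_entry` raises instead of acknowledging data that too few nodes hold.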
  16. Planet Java
    §  Zookeeper and Bookkeeper are both from planet Java
    §  How about something more friendly to Planet Linux?
    §  Use etcd, rewrite Bookkeeper like ScyllaDB did for Cassandra?
  17. Take-aways
    §  Durability is Hard
    §  Distributed Durability is Very Hard
    §  Be Up-Front about your durability model
    §  Logs as Truth & Streaming are the future
    §  Apache Bookkeeper is awesome
    §  Don’t re-invent the wheel!