Elastic{ON} 2018 - Reliable by design - Applying formal methods to distributed systems

Elastic Co
March 01, 2018

Transcript

  1. Reliable by Design - Applying Formal Methods to Distributed Systems
    David Turner (@DaveCTurner), Yannick Welsch (@ywelsch)
  2. Design is thinking
  3. Design is documentation
  4. "Writing is nature’s way of letting you know how sloppy your thinking is." (Richard Guindon)
  5. "Mathematics is nature’s way of letting you know how sloppy your writing is." (Leslie Lamport)
  6. Mathematical tools
  7. Millions of states
  8. Flavours
    Model checking: • Exhaustive search • Finite state space • Accessible
    Interactive theorem proving: • Detailed argument • Arbitrary state space • More specialised
  9. CONSENSUS
  10. Asynchronous system
  11. Properties
    Safety: Nothing bad happens
    Liveness: Something good eventually happens
  12. Majorities
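The slide carries only its title. One standard fact about majorities that consensus protocols rely on is that any two majorities of the same node set share at least one node. The sketch below checks this by brute force for a small cluster; the function names and the three-node cluster are illustrative assumptions, not taken from the talk.

```python
from itertools import combinations


def majorities(nodes):
    """Yield every subset of `nodes` that contains more than half of them."""
    ordered = sorted(nodes)
    quorum_size = len(ordered) // 2 + 1
    for size in range(quorum_size, len(ordered) + 1):
        for subset in combinations(ordered, size):
            yield frozenset(subset)


def all_majorities_intersect(nodes):
    """Return True if every pair of majorities has a node in common."""
    quorums = list(majorities(nodes))
    return all(a & b for a in quorums for b in quorums)


# Hypothetical three-node cluster, used only for illustration.
cluster = {"Node1", "Node2", "Node3"}
assert all_majorities_intersect(cluster)
```

The brute-force check only covers the cluster sizes it is run on; the general argument is that two sets each larger than half of the cluster cannot be disjoint.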

  13. TLA+ (diagram: Node1, Node2, Node3)
    • combines temporal logic and set theory
    • specification defines initial state and next-state relation
    • states represented by assigning values to variables
  14. Next-state relation
    Node n: firstUncommittedSlot: s, currentTerm: t, ...
    PublishResponse{ ... }
    Node n: firstUncommittedSlot: s, currentTerm: t, lastAcceptedTerm: t, lastAcceptedValue: v, ...
    PublishRequest: dest: n, slot: s, term: t, value: v
    PublishResponse: slot: s, term: t
  15. Full specification
    • network behavior • node failures • client submitting values • next-state relation • safety property
        \* next-state relation
        Next ==
          \/ HandlePublishRequest
          \/ HandlePublishResponse
          \/ HandleClientRequest
          \/ SomeNodeCrashes
          \/ ...
        \* main safety property
        StateMachineSafety ==
          \A n1, n2 \in Nodes :
            firstUncommittedSlot[n1] = firstUncommittedSlot[n2] =>
              /\ currentClusterState[n1] = currentClusterState[n2]
              /\ currentConfiguration[n1] = currentConfiguration[n2]
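The StateMachineSafety property on this slide can be read as an ordinary predicate over one snapshot of the per-node variables: whenever two nodes agree on firstUncommittedSlot, they must also agree on cluster state and configuration. A minimal Python sketch of that reading follows. The function name, the dictionary-based state layout and the sample values are illustrative assumptions and are not part of the Elasticsearch formal models.

```python
def state_machine_safety(nodes, first_uncommitted_slot,
                         current_cluster_state, current_configuration):
    """Mirror of the slide's invariant for a single snapshot of the system.

    The implication `A => B` from the TLA+ formula is written as `not A or B`.
    """
    return all(
        first_uncommitted_slot[n1] != first_uncommitted_slot[n2]
        or (current_cluster_state[n1] == current_cluster_state[n2]
            and current_configuration[n1] == current_configuration[n2])
        for n1 in nodes
        for n2 in nodes
    )


# Hypothetical snapshot: n1 and n2 are at slot 3 and agree; n3 lags at slot 2.
nodes = ["n1", "n2", "n3"]
slots = {"n1": 3, "n2": 3, "n3": 2}
states = {"n1": "s3", "n2": "s3", "n3": "s2"}
configs = {"n1": "cfgA", "n2": "cfgA", "n3": "cfgA"}
assert state_machine_safety(nodes, slots, states, configs)

# Same slots, but n2 now disagrees with n1 on the cluster state: a violation.
diverged = dict(states, n2="other")
assert not state_machine_safety(nodes, slots, diverged, configs)
```

A model checker evaluates a predicate like this in every reachable state rather than in a single hand-picked snapshot.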
  16. TLC
    • model checker
    • integrated into IDE
    • exhaustive state exploration
    • breadth-first
    • bounded state space
    • bugs even for small models
    • good at finding edge cases
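The combination of exhaustive, breadth-first exploration over a bounded state space can be illustrated with a few lines of Python. This is not how TLC is implemented; it is a minimal sketch of the general idea, run on a hypothetical toy model (two processes that each read a shared counter and then write back the value read plus one). All names below are assumptions made for the example.

```python
from collections import deque


def check_invariant(initial_states, next_states, invariant):
    """Breadth-first search over all reachable states.

    Returns None if the invariant holds in every reachable state, otherwise
    a shortest trace (list of states) ending in a violating state.
    """
    parent = {state: None for state in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for successor in next_states(state):
            if successor not in parent:  # visit each state only once
                parent[successor] = state
                queue.append(successor)
    return None


def replace_at(values, index, value):
    """Return a copy of tuple `values` with position `index` set to `value`."""
    return values[:index] + (value,) + values[index + 1:]


def next_states(state):
    """Toy next-state relation: some process takes its next step."""
    counter, pcs, local = state
    for i in (0, 1):
        if pcs[i] == "read":
            yield (counter, replace_at(pcs, i, "write"), replace_at(local, i, counter))
        elif pcs[i] == "write":
            yield (local[i] + 1, replace_at(pcs, i, "done"), local)


def invariant(state):
    """Once both processes are done, the counter should equal 2."""
    counter, pcs, _ = state
    return pcs != ("done", "done") or counter == 2


initial = [(0, ("read", "read"), (0, 0))]
trace = check_invariant(initial, next_states, invariant)
for step in trace or []:
    print(step)
```

Even this tiny model has an interleaving that loses an update, and because the search is breadth-first the reported trace is a shortest one, which matches the slide's points about bugs in small models and edge cases.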
  17. Isabelle/HOL
    • interactive proof assistant
    • needs guidance
    • tracks proof goals
    • fully automatically verifies proof
  18. Experiences
    TLA+: • executable specs • rapid prototyping • high confidence • rising in popularity
    Isabelle/HOL: • no state-space limitations • deep insights • even higher confidence
  19. Where can I learn more about this?
    • TLA+ Home Page: http://lamport.azurewebsites.net/tla/tla.html
    • TLA+ Video Course: http://lamport.azurewebsites.net/video/videos.html
    • Introduction to TLA+: https://learntla.com
    • Tutorial on Isabelle/HOL: http://isabelle.in.tum.de/doc/tutorial.pdf
    • Use of Formal Methods at AWS: http://lamport.azurewebsites.net/tla/formal-methods-amazon.pdf
    • Formal models of core Elasticsearch algorithms: https://github.com/elastic/elasticsearch-formal-models
    • Related talk at 3:30pm (Salon 1-7): Elasticsearch Consensus: The Past, the Present, and the Future
    More questions? Visit us at the AMA.
  20. www.elastic.co
  21. Please attribute Elastic with a link to elastic.co