Deterministic Parallel and Distributed Programming with Clojure

Quick Intro to deterministic parallelism. Practical examples in Clojure. LVar(s), CRDT(s), Bloom, Derflow and others.

Oleksii Kachaiev

July 03, 2014

Transcript

  1. Deterministic Parallel and Distributed Programming with Clojure: Quick Intro • Alexey Kachayev, 2014
  2. About me • CTO at Attendify.com • Clojure, Erlang, Go, Haskell • Fn.py library author • CPython & Storm contributor
  3. Find me • @kachayev • github.com/kachayev • kachayev <$> gmail.com

  4. Topic

  5. Will talk • Parallel & Distributed • Determinism: why & when • Models and approaches
  6. Clojure & Concurrency

  7. Atom

  8. Agent

  9. STM

  10. core.async

  11. Deterministic

  12. Non-Deterministic

  13. Why Determinism? • easy to reason about • easy to maintain • fewer bugs • fewer bugs that you can’t reproduce on your machine • less data loss (no data loss?) • provable correctness
  14. Why Determinism? • you should already know why determinism is good if you listen to Clojure conference talks • we will talk about "ordering non-determinism" only (there are many other sources of non-determinism, however)
  15. Parallel & Distributed

  16. Parallel • > 1 independent workers (actors?) • loose coordination • great opportunities • … at a high price
  17. Distribution • More parallelism (!) • For lower latency • For storage replication • For HA • And more … but more non-determinism factors • Unpredictable latencies • "Random" worker crashes
  18. Deterministic distributed program? Let’s think…

  19. Use consensus, Luke!

  20. Consensus • Non-distributed: locks, semaphores, etc. • Distributed: two-phase commit, Paxos, Zab, Raft • … but this price is too high in many cases! • … in most cases you’re trading away availability
  21. Is there any other way?

  22. Monotonicity

  23. Theory • "Logic and Lattices for Distributed Programming" http://goo.gl/5q7CJF • "CRDTs: Consistency without concurrency control" http://goo.gl/Ouu4sc • "A comprehensive study of Convergent and Commutative Replicated Data Types" http://goo.gl/I1alMi • "A Lattice-Theoretical Approach to Deterministic Parallelism with Shared State" http://goo.gl/cdv1UK
  24. Theory • Monotonic logic • Bounded Join Semi-Lattices
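
The lattice idea on this slide can be made concrete with a small sketch. It is written in Python for illustration only and is not taken from the talk; the names `MAX_INT`, `SET_UNION`, `join_all` and `order_independent` are hypothetical. A bounded join semilattice is a set with a least element ("bottom") and a join operation that is associative, commutative and idempotent, so merging the same updates in any order, even with duplicates, gives the same result.

```python
from functools import reduce
from itertools import permutations

# Two classic bounded join semilattices, written as (bottom, join).
MAX_INT = (0, max)                               # non-negative ints, join = max
SET_UNION = (frozenset(), lambda a, b: a | b)    # finite sets, join = union


def join_all(lattice, values):
    """Fold the values with the lattice join, starting from bottom."""
    bottom, join = lattice
    return reduce(join, values, bottom)


def order_independent(lattice, values):
    """Check that every delivery order, with one duplicated message,
    produces the same merged state."""
    results = {
        join_all(lattice, list(order) + [order[0]])
        for order in permutations(values)
    }
    return len(results) == 1


print(order_independent(MAX_INT, [3, 7, 5]))
print(order_independent(
    SET_UNION, [frozenset({1}), frozenset({2}), frozenset({1, 3})]))
```

Both checks hold because `max` and set union satisfy the three join laws; an operation such as addition would fail the duplicate-delivery case.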

  25. Practice • mobile application • chat message stream • k-ordered message IDs • "latest viewed message" mark • offline-mode support
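
One way to read this slide is that the "latest viewed message" mark can be modelled as a value that only moves forward. The sketch below is an assumption about the scenario, not code from the talk: it treats message IDs as comparable integers and uses a hypothetical `LatestViewed` class whose merge is `max`, so devices that were offline can sync in any order without rolling the mark back.

```python
class LatestViewed:
    """Max-register: the mark only ever moves forward."""

    def __init__(self, message_id=0):
        self.message_id = message_id

    def view(self, message_id):
        # Viewing an older message never moves the mark backwards.
        self.message_id = max(self.message_id, message_id)

    def merge(self, other):
        # Merging replicas is the same join: take the larger ID.
        self.message_id = max(self.message_id, other.message_id)


phone = LatestViewed()
phone.view(105)            # viewed while offline

laptop = LatestViewed()
laptop.view(120)

# Two servers receive the same updates in different orders,
# and one of them sees the phone's update twice.
server_a = LatestViewed()
server_a.merge(phone)
server_a.merge(laptop)

server_b = LatestViewed()
server_b.merge(laptop)
server_b.merge(phone)
server_b.merge(phone)

print(server_a.message_id, server_b.message_id)  # 120 120
```

Since k-ordered IDs are only roughly time-ordered, "latest" here means "largest ID", which is a design choice of this sketch.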
  26. Practice

  27. Practice

  28. BTW… we don’t know the global ordering of events in practice :(

  29. Monotonicity!

  30. LVar • Haskell library LVish • Monotonically growing, lattice-based data structures • determinism vs. quasi-determinism • threshold reads • freezing variables
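
Threshold reads and freezing can be illustrated with a toy model. This is a Python sketch under stated assumptions, not the LVish API: `SetLVar` is a hypothetical class over the set-union lattice, and `get` takes a single threshold rather than the set of pairwise-incompatible thresholds used in the LVars papers. A threshold read blocks until the state has grown past the threshold and then returns the threshold itself, never the exact state, which is what keeps the result independent of scheduling. Freezing permits an exact read, at the cost that a later state-changing `put` becomes an error (quasi-determinism).

```python
import threading


class SetLVar:
    """Toy LVar over the lattice of sets ordered by inclusion."""

    def __init__(self):
        self._state = frozenset()
        self._frozen = False
        self._cond = threading.Condition()

    def put(self, items):
        """Join new items into the state (least upper bound = union)."""
        with self._cond:
            new_state = self._state | frozenset(items)
            if self._frozen and new_state != self._state:
                raise RuntimeError("put after freeze would change the value")
            self._state = new_state
            self._cond.notify_all()

    def get(self, threshold):
        """Block until the state contains the threshold, then return
        the threshold, not the possibly larger current state."""
        threshold = frozenset(threshold)
        with self._cond:
            self._cond.wait_for(lambda: threshold <= self._state)
            return threshold

    def freeze(self):
        """Stop further growth and return the exact state."""
        with self._cond:
            self._frozen = True
            return self._state


lv = SetLVar()
workers = [threading.Thread(target=lv.put, args=([n],)) for n in (3, 1, 2)]
for w in workers:
    w.start()

seen = lv.get({1, 2})      # same answer whatever the thread schedule
for w in workers:
    w.join()
final = lv.freeze()        # exact read is safe once all writers are done

print(seen == frozenset({1, 2}))       # True
print(final == frozenset({1, 2, 3}))   # True
```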
  31. LVar

  32. CRDT(s)

  33. CRDT • Conflict-Free Replicated Data Type • Convergent Replicated Data Type • Commutative Replicated Data Type
  34. CRDT: The Idea

  35. CRDT • Counters (G-Counter, PN-Counter) • Registers (LWW-Register, MV-Register) • Sets (G-Set, 2P-Set, PN-Set, OR-Set) • Graphs
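
As an illustration of the counter types listed on this slide, here is a minimal state-based sketch in Python. It is not the knockbox API, and the class and method names are chosen for this example. A G-Counter keeps one grow-only count per replica and merges by per-replica `max`; a PN-Counter pairs two G-Counters, one for increments and one for decrements.

```python
class GCounter:
    """Grow-only counter: one non-decreasing count per replica."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, replica, n=1):
        self.counts[replica] = self.counts.get(replica, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-replica max is associative, commutative and idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0),
                                other.counts.get(k, 0)) for k in keys})


class PNCounter:
    """Counter with decrements: increments minus decrements."""

    def __init__(self, p=None, n=None):
        self.p = p or GCounter()
        self.n = n or GCounter()

    def increment(self, replica, n=1):
        self.p.increment(replica, n)

    def decrement(self, replica, n=1):
        self.n.increment(replica, n)

    def value(self):
        return self.p.value() - self.n.value()

    def merge(self, other):
        return PNCounter(self.p.merge(other.p), self.n.merge(other.n))


a = PNCounter()
a.increment("a", 3)

b = PNCounter()
b.increment("b", 2)
b.decrement("b", 1)

print(a.merge(b).value(), b.merge(a).value())  # 4 4
print(a.merge(b).merge(b).value())             # 4
```

Merging in either order, or merging the same state twice, yields the same value, which is the convergence property the CRDT slides refer to.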
  36. Knockbox

  37. Dataflow programming

  38. Bloom • disorderly programming • state represented with lattices (a few built in) and collections (table, scratch, channel) • runtime implemented as a Ruby DSL • static analysis tools (points of order) • visualisation tools
  39. Bloom

  40. Derflow • Deterministic dataflow programming • Growing set of single-assignment variables • Operations: declare, bind, read/wait • Streams: produce/consume • Erlang implementation: http://goo.gl/lnnfVd
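
The declare / bind / read operations named on this slide can be sketched as follows. This is a Python illustration of single-assignment dataflow variables in general, not Derflow's Erlang API; `DataflowVar` and `declare` are hypothetical names, and treating a rebind to an equal value as a no-op is an assumption of this sketch. Because a variable is bound at most once and readers block until it is bound, the result does not depend on thread scheduling.

```python
import threading


class DataflowVar:
    """Single-assignment variable: bound at most once, reads block."""

    _UNBOUND = object()

    def __init__(self):
        self._value = self._UNBOUND
        self._bound = threading.Event()
        self._lock = threading.Lock()

    def bind(self, value):
        with self._lock:
            if self._value is self._UNBOUND:
                self._value = value
                self._bound.set()
            elif self._value != value:
                raise ValueError("already bound to a different value")

    def read(self):
        # Wait until some producer has bound the variable.
        self._bound.wait()
        return self._value


def declare():
    return DataflowVar()


x, y, z = declare(), declare(), declare()

# The consumer starts before its inputs exist and simply waits for them.
adder = threading.Thread(target=lambda: z.bind(x.read() + y.read()))
adder.start()

y.bind(2)
x.bind(40)
adder.join()

print(z.read())  # 42
```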
  41. Summary

  42. Why Clojure? • strong concurrency primitives (atom) • immutable data types • CRDT library "knockbox" (dead?) • not that much done for distributed computing (compare riak_core in Erlang) • one can use Akka/Pulsar • aphyr/jepsen for testing partition tolerance • a lot of room for experiments
  43. Links • "Eventually Consistent Data Structures" http://goo.gl/HgtIzY • "Knockbox, an Eventual Consistency Toolkit" http://goo.gl/r5XxRH • "LVars: lattice-based data structures for deterministic parallelism" http://goo.gl/IJljEQ • "MVar, IVar, and LVar programs in Haskell" http://goo.gl/dF36k4 • "Distributed deterministic dataflow programming for Erlang" http://goo.gl/y2oH0P • "Sync Free" http://goo.gl/qXZHnb
  44. Learn Clojure For Great Good

  45. Learn Haskell For Great Good

  46. Q/A • Thanks for your attention