Deterministic Parallel and
Distributed Programming
with Clojure
Quick Intro
Alexey Kachayev, 2014
Slide 2
About me
• CTO at Attendify.com
• Clojure, Erlang, Go, Haskell
• Fn.py library author
• CPython & Storm contributor
Slide 3
Find me
• @kachayev
• github.com/kachayev
• kachayev <$> gmail.com
Slide 4
Topic
Slide 5
Will talk
• Parallel & Distributed
• Determinism: why & when
• Models and approaches
Slide 6
Clojure &
Concurrency
Slide 7
Atom
Slide 8
Agent
Slide 9
STM
Slide 10
core.async
Slide 11
Deterministic
Slide 12
Non-Deterministic
Slide 13
Why Determinism?
• easy to reason about
• easy to maintain
• fewer bugs
• fewer bugs that you can’t reproduce on your machine
• less data loss (no data loss?)
• provable correctness
Slide 14
Why Determinism?
• you should know why determinism is good if
you listen to Clojure conference talks
• we will talk about "ordering non-determinism"
only (there are many other sources of non-determinism, however)
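A minimal sketch of what "ordering non-determinism" can look like in Clojure. The helper name `run-writers` and the sample values are assumptions made for illustration, not code from the talk. Two concurrent writers update one atom: with a vector the final value depends on which writer wins the race, with a set it does not.

```clojure
;; Hypothetical illustration; names are not taken from the talk.
(defn run-writers
  "Starts one future per value, each adding its value to a shared atom,
  waits for all of them, and returns the final state."
  [init values]
  (let [state (atom init)
        tasks (mapv (fn [v] (future (swap! state conj v))) values)]
    (run! deref tasks)   ; block until every writer has finished
    @state))

;; Vector: conj appends, so the result depends on which writer runs first.
(def as-vector (run-writers [] [:a :b]))   ; either [:a :b] or [:b :a]

;; Set: the order of the two conj calls cannot be observed in the result.
(def as-set (run-writers #{} [:a :b]))     ; always #{:a :b}
```

Both versions are free of data races thanks to `swap!`; only the set version is also deterministic. When run as a script, calling `(shutdown-agents)` at the end lets the JVM exit promptly after using futures.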
Slide 15
Parallel &
Distributed
Slide 16
Parallel
• > 1 independent workers (actors?)
• loose coordination
• great opportunities
• … at a high price
Slide 17
Distribution
• More parallelism (!)
• For lower latency
• For storage replication
• For HA
• And more … but more non-determinism factors
• Unpredictable latencies
• "Random" worker crashes
Slide 18
Deterministic
distributed program?
Let’s think…
Slide 19
Use consensus, Luke!
Slide 20
Consensus
• Non-distributed: locks, semaphores etc
• Distributed: two-phase commit, Paxos, Zab, Raft
• … but this price is too high in many cases!
• … in most cases you’re trading away availability
Slide 21
Is there any other way?
Slide 22
Monotonicity
Slide 23
Theory
• "Logic and Lattices for Distributed Programming" http://goo.gl/5q7CJF
• "CRDTs: Consistency without concurrency control" http://goo.gl/Ouu4sc
• "A comprehensive study of Convergent and Commutative Replicated Data Types" http://goo.gl/I1alMi
• "A Lattice-Theoretical Approach to Deterministic Parallelism with Shared State" http://goo.gl/cdv1UK
Slide 24
Theory
• Monotonic logic
• Bounded Join Semi-Lattices
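To make the term concrete: a bounded join semi-lattice is a set of values with a least element (bottom) and a join operation that is commutative, associative and idempotent. The sketch below is an illustration under assumptions; the map layout and the names `set-lattice`, `max-lattice`, `lattice-laws?` and `join-all` are invented here and do not come from the papers above.

```clojure
(require '[clojure.set :as set])

;; Two small bounded join semi-lattices (names are illustrative):
;;   sets under union, with bottom #{}
;;   non-negative integers under max, with bottom 0
(def set-lattice {:bottom #{} :join set/union})
(def max-lattice {:bottom 0   :join max})

(defn lattice-laws?
  "Checks the join semi-lattice laws on three sample values a, b, c."
  [{:keys [bottom join]} a b c]
  (and (= (join a b) (join b a))                     ; commutative
       (= (join a (join b c)) (join (join a b) c))   ; associative
       (= (join a a) a)                              ; idempotent
       (= (join bottom a) a)))                       ; bottom is the identity

(defn join-all
  "Folds a collection of values into their least upper bound."
  [{:keys [bottom join]} xs]
  (reduce join bottom xs))
```

Because of these laws, `(join-all set-lattice (shuffle [#{1} #{2 3} #{1}]))` gives `#{1 2 3}` for every shuffle, duplicates included. That is the property that lets state grow monotonically without coordinating on message order.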
Slide 25
Practice
• mobile application
• chat message stream
• k-ordered message IDs
• "latest viewed message" mark
• offline-mode support
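One way the "latest viewed message" mark could be modelled, as a sketch only. It assumes message IDs are positive numbers and that comparing IDs is an acceptable notion of "later", which is what k-ordering roughly provides. The state shape and the names `latest-viewed`, `mark-viewed!` and `merge-marks!` are hypothetical.

```clojure
;; Hypothetical sketch: state shape and function names are assumptions.
(def latest-viewed (atom {}))   ; chat-id -> highest message ID viewed so far

(defn mark-viewed!
  "Records that message-id in chat-id was viewed. The mark only moves forward,
  so late or duplicated reports are harmless."
  [chat-id message-id]
  (swap! latest-viewed update chat-id (fnil max 0) message-id))

(defn merge-marks!
  "Merges marks reported by another device, for example after it has been
  offline. Taking the max per chat makes the merge order-insensitive."
  [remote-marks]
  (swap! latest-viewed #(merge-with max % remote-marks)))
```

For example, after `(mark-viewed! :room 105)` a delayed `(mark-viewed! :room 101)` leaves the mark at 105, and merging `{:room 103}` from an offline device changes nothing for that chat.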
Slide 26
Practice
Slide 27
Practice
Slide 28
BTW…
we don’t know the global ordering
of events in practice :(
Why Clojure?
• strong concurrency primitives (atom)
• immutable data types
• CRDT library "knockbox" (dead?)
• not much has been done for distributed computing yet (compare riak_core in Erlang)
• one can use Akka/Pulsar
• aphyr/jepsen for testing partition tolerance
• a big room for experiments
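A small experiment showing how atoms and immutable maps fit this style: a grow-only counter (G-Counter), one of the simplest CRDTs. This is plain Clojure written for illustration; it is not the knockbox API, and the names `g-increment`, `g-merge` and `g-value` are assumptions.

```clojure
;; Hypothetical G-Counter sketch: a map of replica-id -> local increment count.
(defn g-increment
  "Returns the counter with replica-id's own entry increased by one."
  [counter replica-id]
  (update counter replica-id (fnil inc 0)))

(defn g-merge
  "Joins two counter states by taking the per-replica maximum."
  [a b]
  (merge-with max a b))

(defn g-value
  "Total count: the sum over all replica entries."
  [counter]
  (reduce + 0 (vals counter)))

;; Each replica keeps its state in an atom and only increments its own entry.
(def replica-a (atom {}))
(def replica-b (atom {}))

(swap! replica-a g-increment :a)
(swap! replica-a g-increment :a)
(swap! replica-b g-increment :b)
```

Replicas can exchange states in any order and any number of times: `(g-merge @replica-a @replica-b)` and `(g-merge @replica-b @replica-a)` are equal, and `g-value` of either is 3.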
Slide 43
Links
• "Eventually Consistent Data Structures" http://goo.gl/HgtIzY
• "Knockbox, an Eventual Consistency Toolkit" http://goo.gl/r5XxRH
• "LVars: lattice-based data structures for deterministic parallelism" http://goo.gl/IJljEQ
• "MVar, IVar, and LVar programs in Haskell" http://goo.gl/dF36k4
• "Distributed deterministic dataflow programming for Erlang" http://goo.gl/y2oH0P
• "Sync Free" http://goo.gl/qXZHnb