Slide 1

Slide 1 text

BUILDING APPLICATIONS with DISTRIBUTED ERLANG cmeik the adventures of hash(<<"author">>, <<"Christopher Meiklejohn">>)

Slide 2

Slide 2 text

who am i CMEIK

Slide 4

Slide 4 text

who am i CMEIK distributed systems engineer basho technologies

Slide 5

Slide 5 text

who am i CMEIK distributed systems engineer basho technologies researcher with the syncfree project

Slide 6

Slide 6 text

what is THE AGENDA

Slide 7

Slide 7 text

what is THE AGENDA (novice) 1. what is distributed erlang?

Slide 8

Slide 8 text

what is THE AGENDA (novice) 1. what is distributed erlang? (erlanger) 2. where do i go from here?

Slide 9

Slide 9 text

what is DISTRIBUTED ERLANG

Slide 10

Slide 10 text

what is DISTRIBUTED ERLANG EXTENSION TO HELP BUILD DISTRIBUTED SYSTEMS

Slide 12

Slide 12 text

what are the goals of a DISTRIBUTED SYSTEM

Slide 13

Slide 13 text

what are the goals of a DISTRIBUTED SYSTEM
“A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components.”
Wikipedia, “Distributed Computing”

Slide 14

Slide 14 text

what are some examples of a DISTRIBUTED SYSTEM

Slide 15

Slide 15 text

what are some examples of a DISTRIBUTED SYSTEM distributed databases, riak, cassandra, etc.

Slide 16

Slide 16 text

what are some examples of a DISTRIBUTED SYSTEM distributed databases, riak, cassandra, etc. master/slave in sql, multi-partition txns

Slide 17

Slide 17 text

what are some examples of a DISTRIBUTED SYSTEM distributed databases, riak, cassandra, etc. master/slave in sql, multi-partition txns web services via rest, soap, etc.

Slide 18

Slide 18 text

what are some examples of a DISTRIBUTED SYSTEM
distributed databases, riak, cassandra, etc.
master/slave in sql, multi-partition txns
web services via rest, soap, etc.
mobile clients, internet of things

Slide 19

Slide 19 text

why are distributed systems HARD L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 20

Slide 20 text

why are distributed systems HARD the network is reliable L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 21

Slide 21 text

why are distributed systems HARD the network is reliable latency is zero L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 22

Slide 22 text

why are distributed systems HARD the network is reliable latency is zero bandwidth is infinite L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 23

Slide 23 text

why are distributed systems HARD the network is reliable latency is zero bandwidth is infinite the network is secure L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 24

Slide 24 text

why are distributed systems HARD the network is reliable latency is zero bandwidth is infinite the network is secure topology doesn’t change L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 25

Slide 25 text

why are distributed systems HARD the network is reliable latency is zero bandwidth is infinite the network is secure topology doesn’t change there is one administrator L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 26

Slide 26 text

why are distributed systems HARD the network is reliable latency is zero bandwidth is infinite the network is secure topology doesn’t change there is one administrator transport cost is zero L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 27

Slide 27 text

why are distributed systems HARD
the network is reliable
latency is zero
bandwidth is infinite
the network is secure
topology doesn’t change
there is one administrator
transport cost is zero
the network is homogeneous
L. Peter Deutsch, “Fallacies of Distributed Computing”

Slide 28

Slide 28 text

what is DISTRIBUTED ERLANG

Slide 29

Slide 29 text

what is DISTRIBUTED ERLANG 1 3 2 4 5

Slide 30

Slide 30 text

what is DISTRIBUTED ERLANG 1 3 2 4 5 transparent • message passing • links • monitors transitive connections (except hidden nodes) access control via cookies

Slide 31

Slide 31 text

what is DISTRIBUTED ERLANG
[diagram: nodes 1 to 6]
transparent
• message passing
• links
• monitors
transitive connections (except hidden nodes)
access control via cookies

Slide 34

Slide 34 text

what is DISTRIBUTED ERLANG

Slide 35

Slide 35 text

what is DISTRIBUTED ERLANG 1 2

Slide 36

Slide 36 text

what is DISTRIBUTED ERLANG 1 spawn functions on other nodes 2 p1 p2

Slide 37

Slide 37 text

what is DISTRIBUTED ERLANG 1 spawn functions on other nodes 2 p1 p2 monitor or link on other nodes

Slide 38

Slide 38 text

what is DISTRIBUTED ERLANG 1 spawn functions on other nodes 2 p1 p2 monitor or link on other nodes 1 2 p1 p2

Slide 39

Slide 39 text

what is DISTRIBUTED ERLANG 1 spawn functions on other nodes 2 p1 p2 monitor or link on other nodes 1 2 p1

Slide 40

Slide 40 text

what distributed erlang gets RIGHT

Slide 41

Slide 41 text

what distributed erlang gets RIGHT assumes unreliable asynchronous message passing

Slide 42

Slide 42 text

what libraries come with DISTRIBUTED ERLANG

Slide 43

Slide 43 text

what libraries come with DISTRIBUTED ERLANG global - global name registration and locking

Slide 44

Slide 44 text

what libraries come with DISTRIBUTED ERLANG global - global name registration and locking pg2 - process group registry

Slide 45

Slide 45 text

what libraries come with DISTRIBUTED ERLANG global - global name registration and locking pg2 - process group registry mnesia - distributed transactions

Slide 46

Slide 46 text

what libraries come with DISTRIBUTED ERLANG global - global name registration and locking pg2 - process group registry mnesia - distributed transactions net_kernel - erlang distributed networking kernel

Slide 47

Slide 47 text

what libraries come with DISTRIBUTED ERLANG
global - global name registration and locking
pg2 - process group registry
mnesia - distributed transactions
net_kernel - erlang distributed networking kernel
rpc - remote procedure call services

Slide 48

Slide 48 text

what is GLOBAL

Slide 49

Slide 49 text

what is GLOBAL global name registration and locking service

Slide 50

Slide 50 text

what is GLOBAL global name registration and locking service shared state, replicated locally

Slide 51

Slide 51 text

what is GLOBAL global name registration and locking service shared state, replicated locally races under network partitions

Slide 52

Slide 52 text

what is GLOBAL
global name registration and locking service
shared state, replicated locally
races under network partitions
provides ad-hoc resolution hook

Slide 53

Slide 53 text

what is PG2

Slide 54

Slide 54 text

what is PG2 distributed process group registry

Slide 55

Slide 55 text

what is PG2 distributed process group registry usually used for work partitioning

Slide 56

Slide 56 text

what is PG2 distributed process group registry usually used for work partitioning races under network partitions

Slide 57

Slide 57 text

what is PG2 distributed process group registry usually used for work partitioning races under network partitions can lose values under network partitions

Slide 58

Slide 58 text

what is PG2
distributed process group registry
usually used for work partitioning
races under network partitions
can lose values under network partitions
originally isis-inspired, pg descendant

Slide 59

Slide 59 text

what is MNESIA

Slide 60

Slide 60 text

transactional database in erlang what is MNESIA

Slide 61

Slide 61 text

transactional database in erlang implemented using replicated ets tables what is MNESIA

Slide 62

Slide 62 text

transactional database in erlang implemented using replicated ets tables global transactions are *expensive* what is MNESIA

Slide 63

Slide 63 text

what is MNESIA
transactional database in erlang
implemented using replicated ets tables
global transactions are *expensive*
network partitions can cause values to be lost

Slide 64

Slide 64 text

what is NET_KERNEL

Slide 65

Slide 65 text

maintenance of network in distributed erlang what is NET_KERNEL

Slide 66

Slide 66 text

maintenance of network in distributed erlang responsible for detecting failures what is NET_KERNEL

Slide 67

Slide 67 text

maintenance of network in distributed erlang responsible for detecting failures dropped tcp connections, network partitions what is NET_KERNEL

Slide 68

Slide 68 text

what is NET_KERNEL
maintenance of network in distributed erlang
responsible for detecting failures
dropped tcp connections, network partitions
poor mechanism for cluster management

Slide 69

Slide 69 text

what is RPC

Slide 70

Slide 70 text

remote procedure call services what is RPC

Slide 71

Slide 71 text

remote procedure call services serialized execution of requests what is RPC

Slide 72

Slide 72 text

remote procedure call services serialized execution of requests call to a single node, or multi call what is RPC

Slide 73

Slide 73 text

what is RPC
remote procedure call services
serialized execution of requests
call to a single node, or multi call
synchronous programming pattern

Slide 74

Slide 74 text

what are the ANTI-PATTERNS*

Slide 75

Slide 75 text

what are the ANTI-PATTERNS* utilizing shared state

Slide 76

Slide 76 text

what are the ANTI-PATTERNS* utilizing shared state reliance on physical time

Slide 77

Slide 77 text

what are the ANTI-PATTERNS* utilizing shared state reliance on physical time using predesignated masters

Slide 78

Slide 78 text

what are the ANTI-PATTERNS* utilizing shared state reliance on physical time using predesignated masters treating the network as synchronous

Slide 79

Slide 79 text

what are the ANTI-PATTERNS*
utilizing shared state
reliance on physical time
using predesignated masters
treating the network as synchronous
* also: distributed objects, guaranteed delivery mechanisms, distributed serializable transactions, etc.

Slide 80

Slide 80 text

unfortunately these mechanisms are NAIVE so, what can we do and what have we LEARNED

Slide 81

Slide 81 text

what about CLUSTER MEMBERSHIP

Slide 82

Slide 82 text

what about CLUSTER MEMBERSHIP 1 3 2 4

Slide 83

Slide 83 text

what about CLUSTER MEMBERSHIP fixed membership, a priori 1 3 2 4

Slide 84

Slide 84 text

what about CLUSTER MEMBERSHIP fixed membership, a priori don’t tie to net_kernel, net_ticktime 1 3 2 4

Slide 85

Slide 85 text

what about CLUSTER MEMBERSHIP fixed membership, a priori don’t tie to net_kernel, net_ticktime store information locally, gossip 1 3 2 4

Slide 86

Slide 86 text

what about CLUSTER MEMBERSHIP
fixed membership, a priori
don’t tie to net_kernel, net_ticktime
store information locally, gossip
hyparview, plumtree, thicket
[diagram: nodes 1, 2, 3, 4]
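
The "store information locally, gossip" point can be sketched roughly as follows. This is an illustration in Python rather than Erlang, and it is not hyparview, plumtree, or thicket: each node simply keeps its own copy of the membership view and merges views pairwise. The `Node` class, the per-member version counters, and the fixed ring of exchanges are all assumptions made for the example.

```python
class Node:
    """Hypothetical cluster member holding a local copy of the membership view."""

    def __init__(self, name):
        self.name = name
        # Local view: member name -> highest version seen for that member.
        self.view = {name: 1}

    def gossip_with(self, peer):
        """Exchange views with one peer; both sides keep the newest entries."""
        merged = dict(self.view)
        for member, version in peer.view.items():
            merged[member] = max(merged.get(member, 0), version)
        self.view = dict(merged)
        peer.view = dict(merged)


nodes = [Node(name) for name in ("n1", "n2", "n3", "n4")]

# A fixed ring of exchanges stands in for random peer selection.
for _ in range(2):
    for i, node in enumerate(nodes):
        node.gossip_with(nodes[(i + 1) % len(nodes)])
```

After two passes around the ring every node holds the full view without any node acting as a coordinator. A real protocol would pick peers at random and handle departures, which this sketch leaves out.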

Slide 87

Slide 87 text

what about FAILURE DETECTION 1 3 2 4

Slide 88

Slide 88 text

what about FAILURE DETECTION detection of delays vs. failed nodes 1 3 2 4

Slide 89

Slide 89 text

what about FAILURE DETECTION detection of delays vs. failed nodes net_kernel, net_ticktime 1 3 2 4

Slide 90

Slide 90 text

what about FAILURE DETECTION detection of delays vs. failed nodes net_kernel, net_ticktime the φ accrual failure detector 1 3 2 4

Slide 91

Slide 91 text

what about FAILURE DETECTION
detection of delays vs. failed nodes
net_kernel, net_ticktime
the φ accrual failure detector
swim: membership and failure detector
[diagram: nodes 1, 2, 3, 4]
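
To make "delays vs. failed nodes" concrete, here is a small sketch of an accrual-style detector in Python. It is a simplification and an assumption on my part: the original φ accrual detector models heartbeat intervals with a normal distribution, while this version assumes exponentially distributed intervals so the formula stays short. The class and method names are hypothetical.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Simplified accrual detector (exponential inter-arrival assumption)."""

    def __init__(self, window=100):
        # Sliding window of observed gaps between heartbeats.
        self.intervals = deque(maxlen=window)
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level: -log10 of the chance a heartbeat is still coming."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # With P(later) = exp(-elapsed / mean), phi = elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)
```

The caller picks a threshold on `phi` instead of a fixed timeout, so a slow peer raises suspicion gradually rather than flipping straight to "failed".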

Slide 92

Slide 92 text

what about VALUE DIVERGENCE

Slide 93

Slide 93 text

what about VALUE DIVERGENCE replicated data can diverge 1 1

Slide 94

Slide 94 text

what about VALUE DIVERGENCE replicated data can diverge 1 3 2 1

Slide 95

Slide 95 text

what about VALUE DIVERGENCE replicated data can diverge 1 3 2 1 ?

Slide 96

Slide 96 text

what about VALUE DIVERGENCE replicated data can diverge identify concurrency 1 3 2 1 ?

Slide 97

Slide 97 text

what about VALUE DIVERGENCE replicated data can diverge lamport clock, vector clocks, version vectors, wall clock identify concurrency 1 3 2 1 ?
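
As a rough illustration of "identify concurrency", the sketch below compares two vector clocks in Python. Representing a clock as a dict from node name to counter, and the function names, are assumptions for the example rather than anything taken from the talk.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Pointwise maximum of two clocks, used when a replica learns of both."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def compare(a, b):
    """Return 'equal', 'before', 'after', or 'concurrent' for clocks a and b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


base = increment({}, "n1")        # {"n1": 1}
left = increment(base, "n1")      # written on n1
right = increment(base, "n2")     # written on n2 without seeing `left`
```

Here `left` and `right` both descend from `base` but neither has seen the other, so `compare` reports them as concurrent. A wall clock alone would silently order them instead.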

Slide 98

Slide 98 text

what about VALUE DIVERGENCE
replicated data can diverge
identify concurrency
lamport clock, vector clocks, version vectors, wall clock
how to resolve?
lww vs. siblings vs. crdt
[diagram: 1 3 2 1 ?]
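
The "lww vs. siblings vs. crdt" choice can be illustrated with a counter. The sketch below is Python and purely illustrative: `lww_merge` and the `GCounter` class are hypothetical names, the grow-only counter is just one simple CRDT, and siblings (keeping both values for the application to resolve) are not shown.

```python
def lww_merge(a, b):
    """Last-write-wins: a and b are (timestamp, value); the later write survives."""
    return a if a[0] >= b[0] else b


class GCounter:
    """Grow-only counter: one slot per node, merged by pointwise maximum."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, node, amount=1):
        self.counts[node] = self.counts.get(node, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        nodes = set(self.counts) | set(other.counts)
        return GCounter(
            {n: max(self.counts.get(n, 0), other.counts.get(n, 0)) for n in nodes}
        )


# Both replicas start from a count of 1 and each records one more increment.
lww_a = (10.0, 2)   # replica a wrote 2 at wall-clock time 10.0
lww_b = (10.5, 2)   # replica b wrote 2 at wall-clock time 10.5
lww_result = lww_merge(lww_a, lww_b)[1]

crdt_a = GCounter({"a": 1})
crdt_b = GCounter({"a": 1})
crdt_a.increment("a")
crdt_b.increment("b")
crdt_result = crdt_a.merge(crdt_b).value()
```

Last-write-wins keeps a single write and ends at 2, losing one increment, while the merged counter reaches 3 regardless of merge order.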

Slide 99

Slide 99 text

what about DISTRIBUTION BUFFERS

Slide 100

Slide 100 text

what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2

Slide 102

Slide 102 text

what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2 large objects can cause head-of-line blocking

Slide 103

Slide 103 text

what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2 large objects can cause head-of-line blocking move objects to tcp sockets

Slide 104

Slide 104 text

what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2 large objects can cause head-of-line blocking move objects to tcp sockets increase distribution port buffer size

Slide 105

Slide 105 text

what about DISTRIBUTION BUFFERS
[diagram: nodes 1 and 2 with processes p1, p2, p3]
large objects can cause head-of-line blocking
move objects to tcp sockets
increase distribution port buffer size
beware unbounded queues
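
One possible reading of "beware unbounded queues" is to put an explicit limit on every queue and decide what happens when it fills up. The Python sketch below shows load shedding with the standard `queue` module; the `BoundedMailbox` class and its counters are hypothetical and say nothing about how the Erlang distribution buffers themselves behave.

```python
import queue


class BoundedMailbox:
    """Hypothetical mailbox that sheds load instead of growing without bound."""

    def __init__(self, limit):
        self._queue = queue.Queue(maxsize=limit)
        self.dropped = 0

    def offer(self, message):
        """Try to enqueue; report failure to the sender when the queue is full."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self):
        """Remove and return the oldest queued message."""
        return self._queue.get_nowait()


mailbox = BoundedMailbox(limit=2)
results = [mailbox.offer(m) for m in ("a", "b", "c")]
```

The third offer is refused, so the sender learns about the overload immediately and can back off, rather than memory growing until the node falls over.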

Slide 106

Slide 106 text

what about LEADER ELECTION

Slide 107

Slide 107 text

what about LEADER ELECTION 1 2 3 4

Slide 108

Slide 108 text

what about LEADER ELECTION gen_leader assumes fixed topology 1 2 3 4

Slide 109

Slide 109 text

what about LEADER ELECTION gen_leader assumes fixed topology use paxos or raft 1 2 3 4

Slide 110

Slide 110 text

what about LEADER ELECTION
gen_leader assumes fixed topology
use paxos or raft
gen_leader known to deadlock
[diagram: nodes 1, 2, 3, 4]

Slide 111

Slide 111 text

what about MESSAGE FORMATS

Slide 112

Slide 112 text

what about MESSAGE FORMATS v1 v2

Slide 113

Slide 113 text

what about MESSAGE FORMATS mixed version clusters reality in large systems v1 v2

Slide 114

Slide 114 text

what about MESSAGE FORMATS mixed version clusters reality in large systems keep formats compatible v1 v2

Slide 115

Slide 115 text

what about MESSAGE FORMATS
mixed-version clusters are the reality in large systems
keep formats compatible
upgrade to new format
[diagram: v1, v2]
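
A minimal sketch of "keep formats compatible" in a mixed-version cluster, written in Python for illustration. The tuple layout, the `ttl` field added in v2, and the default value are all invented for the example; the point is only that every message carries its version and new nodes still decode the old shape.

```python
DEFAULT_TTL = 60  # assumed default for messages from nodes that predate v2


def encode_v1(key, value):
    """Old format: version tag, key, value."""
    return ("v1", key, value)


def encode_v2(key, value, ttl):
    """New format: adds a ttl field after the existing ones."""
    return ("v2", key, value, ttl)


def decode(message):
    """Accept both formats so v1 and v2 nodes can coexist during an upgrade."""
    version = message[0]
    if version == "v1":
        _, key, value = message
        return {"key": key, "value": value, "ttl": DEFAULT_TTL}
    if version == "v2":
        _, key, value, ttl = message
        return {"key": key, "value": value, "ttl": ttl}
    raise ValueError(f"unsupported message version: {version!r}")


old = decode(encode_v1("user", "cmeik"))
new = decode(encode_v2("user", "cmeik", 5))
```

Nodes would only start sending v2 once every member can decode it, which is the "upgrade to new format" step.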

Slide 116

Slide 116 text

what about MESSAGE DELIVERY

Slide 117

Slide 117 text

what about MESSAGE DELIVERY message delivery not guaranteed with failures

Slide 118

Slide 118 text

what about MESSAGE DELIVERY message delivery not guaranteed with failures 1 2

Slide 119

Slide 119 text

what about MESSAGE DELIVERY message delivery not guaranteed with failures 1 2 1 2 3 4

Slide 120

Slide 120 text

what about MESSAGE DELIVERY
message delivery not guaranteed with failures
[diagram]
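
Since delivery is not guaranteed under failures, applications usually add their own acknowledgements, retries, and duplicate suppression. The Python sketch below simulates that with a scripted lossy link; `Receiver`, `ScriptedLink`, and `send_with_retry` are hypothetical names, and the outcomes are hard-coded so the run is deterministic.

```python
class Receiver:
    """Applies each message id at most once, even if it arrives repeatedly."""

    def __init__(self):
        self.seen = set()
        self.delivered = []

    def receive(self, msg_id, payload):
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.delivered.append(payload)
        return msg_id  # acknowledgement


class ScriptedLink:
    """Link whose failures follow a script: 'drop', 'lose_ack', or 'ok'."""

    def __init__(self, receiver, outcomes):
        self.receiver = receiver
        self.outcomes = list(outcomes)

    def send(self, msg_id, payload):
        outcome = self.outcomes.pop(0) if self.outcomes else "ok"
        if outcome == "drop":
            return None  # message never arrives
        ack = self.receiver.receive(msg_id, payload)
        if outcome == "lose_ack":
            return None  # message arrived, acknowledgement lost
        return ack


def send_with_retry(link, msg_id, payload, max_attempts=5):
    """Resend until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if link.send(msg_id, payload) == msg_id:
            return attempt
    raise TimeoutError(f"no ack for message {msg_id}")


receiver = Receiver()
link = ScriptedLink(receiver, ["drop", "lose_ack", "ok"])
attempts = send_with_retry(link, 1, "hello")
```

The message is sent three times and arrives twice, but the receiver applies it once. Retries give at-least-once delivery, and the id check makes the duplicates harmless.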

Slide 121

Slide 121 text

what can we take away from this DISCUSSION

Slide 122

Slide 122 text

what are the LESSONS

Slide 123

Slide 123 text

distributed erlang gets you part of the way but, you still have to understand the problems and the tradeoffs what are the LESSONS

Slide 124

Slide 124 text

what are the LESSONS
distributed erlang gets you part of the way
but, you still have to understand the problems and the tradeoffs
http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html

Slide 125

Slide 125 text

what are some useful ERLANG LIBRARIES

Slide 126

Slide 126 text

what are some useful ERLANG LIBRARIES riak_core: distributed systems toolkit https://github.com/basho/riak_core

Slide 127

Slide 127 text

what are some useful ERLANG LIBRARIES riak_core: distributed systems toolkit https://github.com/basho/riak_core riak_ensemble: generic multi-paxos framework https://github.com/basho/riak_ensemble

Slide 128

Slide 128 text

what are some useful ERLANG LIBRARIES riak_core: distributed systems toolkit https://github.com/basho/riak_core riak_ensemble: generic multi-paxos framework https://github.com/basho/riak_ensemble riak_dt: conflict-free replicated data types https://github.com/basho/riak_dt

Slide 129

Slide 129 text

what are some useful ERLANG LIBRARIES
riak_core: distributed systems toolkit
https://github.com/basho/riak_core
riak_ensemble: generic multi-paxos framework
https://github.com/basho/riak_ensemble
riak_dt: conflict-free replicated data types
https://github.com/basho/riak_dt
riak_test: riak_core testing framework
https://github.com/basho/riak_test

Slide 130

Slide 130 text

what are some useful RESEARCH PROJECTS

Slide 131

Slide 131 text

what are some useful RESEARCH PROJECTS syncfree: large-scale computation on erlang https://github.com/syncfree

Slide 132

Slide 132 text

what are some useful RESEARCH PROJECTS syncfree: large-scale computation on erlang https://github.com/syncfree release: large-scale erlang deployments https://github.com/release-project/

Slide 133

Slide 133 text

what are some useful RESEARCH PROJECTS
syncfree: large-scale computation on erlang
https://github.com/syncfree
release: large-scale erlang deployments
https://github.com/release-project/
paraphrase: parallel computation
https://github.com/paraphrase

Slide 134

Slide 134 text

what are some PUBLICATIONS

Slide 135

Slide 135 text

what are some PUBLICATIONS

Slide 136

Slide 136 text

what are some PUBLICATIONS

Slide 137

Slide 137 text

what are some PUBLICATIONS

Slide 138

Slide 138 text

what are some PUBLICATIONS

Slide 139

Slide 139 text

what are some PUBLICATIONS

Slide 140

Slide 140 text

what are some PUBLICATIONS
pid reuse
unreliable failure detectors
unreliable delivery of messages
…and more!

Slide 141

Slide 141 text

do you have any questions? THANKS!
