Building Applications with Distributed Erlang

NDC London, 2014

Christopher Meiklejohn

December 04, 2014

Transcript

  1. BUILDING APPLICATIONS with DISTRIBUTED ERLANG cmeik the adventures of hash(<<"author">>, <<"Christopher Meiklejohn">>)
  2. who am i CMEIK

  4. who am i CMEIK distributed systems engineer basho technologies

  5. who am i CMEIK distributed systems engineer basho technologies researcher

    with the syncfree project
  6. what is THE AGENDA

  7. what is THE AGENDA (novice) 1. what is distributed erlang?

  8. what is THE AGENDA (novice) 1. what is distributed erlang?

    (erlanger) 2. where do i go from here?
  9. what is DISTRIBUTED ERLANG

  10. what is DISTRIBUTED ERLANG EXTENSION TO HELP BUILD DISTRIBUTED SYSTEMS

  12. what are the goals of a DISTRIBUTED SYSTEM

  13. what are the goals of a DISTRIBUTED SYSTEM “A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components.” Wikipedia, “Distributed Computing”
  14. what are some examples of a DISTRIBUTED SYSTEM

  15. what are some examples of a DISTRIBUTED SYSTEM distributed databases,

    riak, cassandra, etc.
  16. what are some examples of a DISTRIBUTED SYSTEM distributed databases,

    riak, cassandra, etc. master/slave in sql, multi-partition txns
  17. what are some examples of a DISTRIBUTED SYSTEM distributed databases,

    riak, cassandra, etc. master/slave in sql, multi-partition txns web services via rest, soap, etc.
  18. what are some examples of a DISTRIBUTED SYSTEM distributed databases,

    riak, cassandra, etc. master/slave in sql, multi-partition txns web services via rest, soap, etc. mobile clients, internet of things
  19. why are distributed systems HARD L. Peter Deutsch, “Fallacies of

    Distributed Computing”
  20. why are distributed systems HARD the network is reliable L.

    Peter Deutsch, “Fallacies of Distributed Computing”
  21. why are distributed systems HARD the network is reliable latency

    is zero L. Peter Deutsch, “Fallacies of Distributed Computing”
  22. why are distributed systems HARD the network is reliable latency

    is zero bandwidth is infinite L. Peter Deutsch, “Fallacies of Distributed Computing”
  23. why are distributed systems HARD the network is reliable latency

    is zero bandwidth is infinite the network is secure L. Peter Deutsch, “Fallacies of Distributed Computing”
  24. why are distributed systems HARD the network is reliable latency

    is zero bandwidth is infinite the network is secure topology doesn’t change L. Peter Deutsch, “Fallacies of Distributed Computing”
  25. why are distributed systems HARD the network is reliable latency

    is zero bandwidth is infinite the network is secure topology doesn’t change there is one administrator L. Peter Deutsch, “Fallacies of Distributed Computing”
  26. why are distributed systems HARD the network is reliable latency

    is zero bandwidth is infinite the network is secure topology doesn’t change there is one administrator transport cost is zero L. Peter Deutsch, “Fallacies of Distributed Computing”
  27. why are distributed systems HARD the network is reliable; latency is zero; bandwidth is infinite; the network is secure; topology doesn’t change; there is one administrator; transport cost is zero; the network is homogeneous (L. Peter Deutsch, “Fallacies of Distributed Computing”)
  28. what is DISTRIBUTED ERLANG

  29. what is DISTRIBUTED ERLANG 1 3 2 4 5

  30. what is DISTRIBUTED ERLANG 1 3 2 4 5 transparent

    • message passing • links • monitors transitive connections (except hidden nodes) access control via cookies
  31. what is DISTRIBUTED ERLANG 1 3 2 4 5 transparent

    • message passing • links • monitors transitive connections (except hidden nodes) access control via cookies 6
  34. what is DISTRIBUTED ERLANG

  35. what is DISTRIBUTED ERLANG 1 2

  36. what is DISTRIBUTED ERLANG 1 spawn functions on other nodes

    2 p1 p2
  37. what is DISTRIBUTED ERLANG 1 spawn functions on other nodes

    2 p1 p2 monitor or link on other nodes
  38. what is DISTRIBUTED ERLANG 1 spawn functions on other nodes

    2 p1 p2 monitor or link on other nodes 1 2 p1 p2
  39. what is DISTRIBUTED ERLANG 1 spawn functions on other nodes

    2 p1 p2 monitor or link on other nodes 1 2 p1
  40. what distributed erlang gets RIGHT

  41. what distributed erlang gets RIGHT assumes unreliable asynchronous message passing

  42. what libraries come with DISTRIBUTED ERLANG

  43. what libraries come with DISTRIBUTED ERLANG global - global name

    registration and locking
  44. what libraries come with DISTRIBUTED ERLANG global - global name

    registration and locking pg2 - process group registry
  45. what libraries come with DISTRIBUTED ERLANG global - global name

    registration and locking pg2 - process group registry mnesia - distributed transactions
  46. what libraries come with DISTRIBUTED ERLANG global - global name

    registration and locking pg2 - process group registry mnesia - distributed transactions net_kernel - erlang distributed networking kernel
  47. what libraries come with DISTRIBUTED ERLANG global - global name

    registration and locking pg2 - process group registry mnesia - distributed transactions net_kernel - erlang distributed networking kernel rpc - remote procedure call services
  48. what is GLOBAL

  49. what is GLOBAL global name registration and locking service

  50. what is GLOBAL global name registration and locking service shared

    state, replicated locally
  51. what is GLOBAL global name registration and locking service shared

    state, replicated locally races under network partitions
  52. what is GLOBAL global name registration and locking service shared

    state, replicated locally races under network partitions provides ad-hoc resolution hook
  53. what is PG2

  54. what is PG2 distributed process group registry

  55. what is PG2 distributed process group registry usually used for

    work partitioning
  56. what is PG2 distributed process group registry usually used for

    work partitioning races under network partitions
  57. what is PG2 distributed process group registry usually used for

    work partitioning races under network partitions can lose values under network partitions
  58. what is PG2 distributed process group registry usually used for

    work partitioning races under network partitions can lose values under network partitions originally isis inspired, pg descendant
  59. what is MNESIA

  60. what is MNESIA transactional database in erlang

  61. what is MNESIA transactional database in erlang implemented using replicated ets tables

  62. what is MNESIA transactional database in erlang implemented using replicated ets tables global transactions are *expensive*

  63. what is MNESIA transactional database in erlang implemented using replicated ets tables global transactions are *expensive* network partitions can cause values to be lost
  64. what is NET_KERNEL

  65. what is NET_KERNEL maintenance of network in distributed erlang

  66. what is NET_KERNEL maintenance of network in distributed erlang responsible for detecting failures

  67. what is NET_KERNEL maintenance of network in distributed erlang responsible for detecting failures dropped tcp connections, network partitions

  68. what is NET_KERNEL maintenance of network in distributed erlang responsible for detecting failures dropped tcp connections, network partitions poor mechanism for cluster management
  69. what is RPC

  70. what is RPC remote procedure call services

  71. what is RPC remote procedure call services serialized execution of requests

  72. what is RPC remote procedure call services serialized execution of requests call to a single node, or multi call

  73. what is RPC remote procedure call services serialized execution of requests call to a single node, or multi call synchronous programming pattern
  74. what are the ANTI-PATTERNS*

  75. what are the ANTI-PATTERNS* utilizing shared state

  76. what are the ANTI-PATTERNS* utilizing shared state reliance on physical

    time
  77. what are the ANTI-PATTERNS* utilizing shared state reliance on physical

    time using predesignated masters
  78. what are the ANTI-PATTERNS* utilizing shared state reliance on physical

    time using predesignated masters treating the network as synchronous
  79. what are the ANTI-PATTERNS* utilizing shared state reliance on physical

    time using predesignated masters treating the network as synchronous * also: distributed objects, guaranteed delivery mechanisms, distributed serializable transactions, etc.
  80. unfortunately these mechanisms are NAIVE so, what can we do and what have we LEARNED
  81. what about CLUSTER MEMBERSHIP

  82. what about CLUSTER MEMBERSHIP 1 3 2 4

  83. what about CLUSTER MEMBERSHIP fixed membership, a priori 1 3

    2 4
  84. what about CLUSTER MEMBERSHIP fixed membership, a priori don’t tie

    to net_kernel, net_ticktime 1 3 2 4
  85. what about CLUSTER MEMBERSHIP fixed membership, a priori don’t tie

    to net_kernel, net_ticktime store information locally, gossip 1 3 2 4
  86. what about CLUSTER MEMBERSHIP fixed membership, a priori don’t tie

    to net_kernel, net_ticktime store information locally, gossip hyparview, plumtree, thicket 1 3 2 4
  87. what about FAILURE DETECTION 1 3 2 4

  88. what about FAILURE DETECTION detection of delays vs. failed nodes

    1 3 2 4
  89. what about FAILURE DETECTION detection of delays vs. failed nodes

    net_kernel, net_ticktime 1 3 2 4
  90. what about FAILURE DETECTION detection of delays vs. failed nodes

    net_kernel, net_ticktime the φ accrual failure detector 1 3 2 4
  91. what about FAILURE DETECTION detection of delays vs. failed nodes

    net_kernel, net_ticktime the φ accrual failure detector swim: membership and failure detector 1 3 2 4
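
The slides contrast a fixed timeout (net_ticktime) with the φ accrual failure detector, which reports a growing suspicion level instead of a yes/no answer. A minimal sketch of that idea follows, in Python purely for illustration; it is not Erlang and not any library's API. It assumes heartbeat inter-arrival times are exponentially distributed, which is a simplification chosen to keep the arithmetic short. The names `PhiAccrualDetector`, `heartbeat`, and `phi` are hypothetical.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Simplified phi accrual failure detector (hypothetical sketch).

    Assumption: heartbeat inter-arrival times are exponentially
    distributed, so P(next heartbeat arrives later than t) = exp(-t / mean).
    """

    def __init__(self, window=100):
        self.intervals = deque(maxlen=window)  # recent inter-arrival times
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Suspicion level: -log10 of the chance the node is merely slow."""
        if not self.intervals:
            return 0.0  # not enough history to suspect anything
        # Floor the mean to avoid dividing by zero on identical timestamps.
        mean = max(sum(self.intervals) / len(self.intervals), 1e-6)
        elapsed = now - self.last_arrival
        # -log10(exp(-elapsed / mean)) == elapsed / (mean * ln 10)
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):  # heartbeats arriving once per second
    detector.heartbeat(t)

short_delay = detector.phi(3.5)   # half a second of silence: low suspicion
long_delay = detector.phi(13.0)   # ten seconds of silence: high suspicion
```

The caller picks the threshold on φ, so a slow node and a failed node can be treated differently by different parts of the system.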
  92. what about VALUE DIVERGENCE

  93. what about VALUE DIVERGENCE replicated data can diverge 1 1

  94. what about VALUE DIVERGENCE replicated data can diverge 1 3

    2 1
  95. what about VALUE DIVERGENCE replicated data can diverge 1 3

    2 1 ?
  96. what about VALUE DIVERGENCE replicated data can diverge identify concurrency

    1 3 2 1 ?
  97. what about VALUE DIVERGENCE replicated data can diverge lamport clock,

    vector clocks, version vectors, wall clock identify concurrency 1 3 2 1 ?
  98. what about VALUE DIVERGENCE replicated data can diverge lamport clock,

    vector clocks, version vectors, wall clock identify concurrency how to resolve? lww vs. siblings vs. crdt 1 3 2 1 ?
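
To make "identify concurrency" concrete, here is a small sketch of vector clock comparison, written in Python only as an illustration. Clocks are plain dictionaries from a node id to a counter; the function names `compare`, `merge`, and `increment` and the node ids are hypothetical.

```python
def compare(a, b):
    """Compare two vector clocks given as {node_id: counter} dicts.

    Returns "equal", "before" (a happened before b), "after", or
    "concurrent" (neither dominates, so the values have diverged).
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(a, b):
    """Pointwise maximum: the clock that descends from both inputs."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


# Two replicas start from the same write, then each accepts an update
# without seeing the other's.
base = increment({}, "node1")
on_node1 = increment(base, "node1")
on_node2 = increment(base, "node2")
```

Here `compare(on_node1, on_node2)` reports the two updates as concurrent. That is the point at which an application has to choose a resolution strategy such as lww, siblings, or a crdt merge.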
  99. what about DISTRIBUTION BUFFERS

  100. what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2

  102. what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2 large

    objects can cause head-of-line blocking
  103. what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2 large

    objects can cause head-of-line blocking move objects to tcp sockets
  104. what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2 large

    objects can cause head-of-line blocking move objects to tcp sockets increase distribution port buffer size
  105. what about DISTRIBUTION BUFFERS 1 2 p1 p3 p2 large

    objects can cause head-of-line blocking move objects to tcp sockets increase distribution port buffer size beware unbounded queues
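
As an illustration of "beware unbounded queues", the sketch below shows a bounded mailbox that refuses new work once it is full, so memory use stays capped and the sender learns about overload. It is written in Python using the standard `queue` module; the class `BoundedMailbox` and its methods are hypothetical names, and this is not how Erlang process mailboxes or distribution buffers are implemented.

```python
import queue


class BoundedMailbox:
    """Hypothetical mailbox that rejects work instead of growing forever."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0  # number of messages refused because of overload

    def offer(self, message):
        """Try to enqueue; return False (and count a drop) when full."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self):
        """Remove and return the oldest message, or None when empty."""
        try:
            return self._queue.get_nowait()
        except queue.Empty:
            return None


mailbox = BoundedMailbox(capacity=2)
results = [mailbox.offer(m) for m in ("a", "b", "c")]  # third offer is refused
```

Dropping is only one possible policy; blocking the sender or shedding the oldest message are alternatives with different tradeoffs.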
  106. what about LEADER ELECTION

  107. what about LEADER ELECTION 1 2 3 4

  108. what about LEADER ELECTION gen_leader assumes fixed topology 1 2

    3 4
  109. what about LEADER ELECTION gen_leader assumes fixed topology use paxos

    or raft 1 2 3 4
  110. what about LEADER ELECTION gen_leader assumes fixed topology use paxos

    or raft gen_leader known to deadlock 1 2 3 4
  111. what about MESSAGE FORMATS

  112. what about MESSAGE FORMATS v1 v2

  113. what about MESSAGE FORMATS mixed version clusters reality in large

    systems v1 v2
  114. what about MESSAGE FORMATS mixed version clusters reality in large

    systems keep formats compatible v1 v2
  115. what about MESSAGE FORMATS mixed version clusters reality in large

    systems keep formats compatible upgrade to new format v1 v2
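
One way to read "keep formats compatible" and "upgrade to new format" is that every node accepts both the old and the new message shape while a cluster runs mixed versions, and only emits the new shape to peers known to understand it. The Python sketch below illustrates that under invented assumptions: the message fields (`key`, `value`, `ttl`), the `version` tag, the default value, and the function names are all hypothetical.

```python
DEFAULT_TTL = 60  # hypothetical default used when a v1 sender omits the field


def upgrade(message):
    """Normalize any supported wire format to the newest (v2) shape."""
    version = message.get("version", 1)
    if version == 1:
        return {"version": 2, "key": message["key"],
                "value": message["value"], "ttl": DEFAULT_TTL}
    if version == 2:
        return dict(message)
    raise ValueError(f"unsupported message version: {version}")


def encode_for(message, peer_version):
    """Downgrade a v2 message for a peer that still speaks v1."""
    if peer_version >= 2:
        return dict(message)
    return {"version": 1, "key": message["key"], "value": message["value"]}


old = {"version": 1, "key": "k", "value": "v"}
normalized = upgrade(old)                              # what a v2 node works with
for_old_peer = encode_for(normalized, peer_version=1)  # what a v1 node receives
```

Internal code only ever handles the newest shape, and version handling stays at the edge where messages enter and leave the node.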
  116. what about MESSAGE DELIVERY

  117. what about MESSAGE DELIVERY message delivery not guaranteed with failures

  118. what about MESSAGE DELIVERY message delivery not guaranteed with failures

    1 2
  119. what about MESSAGE DELIVERY message delivery not guaranteed with failures

    1 2 1 2 3 4
  120. what about MESSAGE DELIVERY message delivery not guaranteed with failures

    1 2 1 1 2 2 3 1 4 3 4
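
Because delivery is not guaranteed when failures occur, a sender that needs an effect to happen usually retries, and the receiver has to tolerate duplicates. The Python sketch below simulates that with a scripted lossy channel. It retries a bounded number of times and then gives up, so it does not guarantee delivery. All names (`Receiver`, `make_lossy_channel`, `send_with_retry`) and the outcome labels are hypothetical.

```python
class Receiver:
    """Applies each message id at most once, even if it arrives repeatedly."""

    def __init__(self):
        self.seen = set()
        self.applied = []

    def deliver(self, msg_id, payload):
        if msg_id not in self.seen:  # ignore duplicates caused by retries
            self.seen.add(msg_id)
            self.applied.append(payload)
        return msg_id  # acknowledgement


def make_lossy_channel(receiver, script):
    """Build a channel whose behavior per attempt follows `script`.

    Each entry is "drop" (message lost), "lose_ack" (message delivered,
    acknowledgement lost), or "ok". Attempts beyond the script succeed.
    """
    outcomes = iter(script)

    def channel(msg_id, payload):
        outcome = next(outcomes, "ok")
        if outcome == "drop":
            return None
        ack = receiver.deliver(msg_id, payload)
        return None if outcome == "lose_ack" else ack

    return channel


def send_with_retry(channel, msg_id, payload, max_attempts=5):
    """Resend until acknowledged; return attempts used, or None on give-up."""
    for attempt in range(1, max_attempts + 1):
        if channel(msg_id, payload) == msg_id:
            return attempt
    return None


receiver = Receiver()
channel = make_lossy_channel(receiver, ["lose_ack", "drop", "ok"])
attempts = send_with_retry(channel, "m1", "hello")
```

The message reaches the receiver twice (the first and third attempts) but is applied once, and the sender still has to handle the case where every attempt fails.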
  121. what can we take away from this DISCUSSION

  122. what are the LESSONS

  123. what are the LESSONS distributed erlang gets you part of the way, but you still have to understand the problems and the tradeoffs

  124. what are the LESSONS distributed erlang gets you part of the way, but you still have to understand the problems and the tradeoffs http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html
  125. what are some useful ERLANG LIBRARIES

  126. what are some useful ERLANG LIBRARIES riak_core: distributed systems toolkit

    https://github.com/basho/riak_core
  127. what are some useful ERLANG LIBRARIES riak_core: distributed systems toolkit

    https://github.com/basho/riak_core riak_ensemble: generic multi-paxos framework https://github.com/basho/riak_ensemble
  128. what are some useful ERLANG LIBRARIES riak_core: distributed systems toolkit

    https://github.com/basho/riak_core riak_ensemble: generic multi-paxos framework https://github.com/basho/riak_ensemble riak_dt: conflict-free replicated data types https://github.com/basho/riak_dt
  129. what are some useful ERLANG LIBRARIES riak_core: distributed systems toolkit

    https://github.com/basho/riak_core riak_ensemble: generic multi-paxos framework https://github.com/basho/riak_ensemble riak_dt: conflict-free replicated data types https://github.com/basho/riak_dt riak_test: riak_core testing framework https://github.com/basho/riak_test
  130. what are some useful RESEARCH PROJECTS

  131. what are some useful RESEARCH PROJECTS syncfree: large-scale computation on

    erlang https://github.com/syncfree
  132. what are some useful RESEARCH PROJECTS syncfree: large-scale computation on

    erlang https://github.com/syncfree release: large-scale erlang deployments https://github.com/release-project/
  133. what are some useful RESEARCH PROJECTS syncfree: large-scale computation on

    erlang https://github.com/syncfree release: large-scale erlang deployments https://github.com/release-project/ paraphrase: parallel computation https://github.com/paraphrase
  134. what are some PUBLICATIONS

  140. what are some PUBLICATIONS pid reuse unreliable failure detectors unreliable

    delivery of messages …and more!
  141. do you have any questions? THANKS!
