what are the goals of a DISTRIBUTED SYSTEM
“A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components.”
Wikipedia, “Distributed Computing”

what are some examples of a DISTRIBUTED SYSTEM
- distributed databases: riak, cassandra, etc.
- master/slave in sql, multi-partition txns
- web services via rest, soap, etc.
- mobile clients, internet of things

why are distributed systems HARD
because each of these common assumptions is false:
- the network is reliable
- latency is zero
- bandwidth is infinite
- the network is secure
- topology doesn’t change
- there is one administrator
- transport cost is zero
- the network is homogeneous
L. Peter Deutsch, “Fallacies of Distributed Computing”

what libraries come with DISTRIBUTED ERLANG
- global: global name registration and locking
- pg2: process group registry
- mnesia: distributed transactions
- net_kernel: erlang distributed networking kernel
- rpc: remote procedure call services

what is GLOBAL
- global name registration and locking service
- shared state, replicated locally
- races under network partitions
- provides ad-hoc resolution hook

what is PG2
- distributed process group registry
- usually used for work partitioning
- races under network partitions
- can lose values under network partitions
- originally isis-inspired; descendant of pg

what is MNESIA
- transactional database in erlang
- implemented using replicated ets tables
- global transactions are *expensive*
- network partitions can cause values to be lost

what is NET_KERNEL
- maintenance of network in distributed erlang
- responsible for detecting failures: dropped tcp connections, network partitions
- poor mechanism for cluster management

what are the ANTI-PATTERNS*
- utilizing shared state
- reliance on physical time
- using predesignated masters
- treating the network as synchronous
* also: distributed objects, guaranteed delivery mechanisms, distributed serializable transactions, etc.

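One common alternative to relying on physical time is a logical clock. The following is a minimal Python sketch of a Lamport clock, offered as an illustration only; the class and method names are hypothetical and do not come from the talk or from any Erlang library.

```python
class LamportClock:
    """Logical clock: orders events without consulting wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance for a local event, or just before sending a message."""
        self.time += 1
        return self.time

    def receive(self, remote_time):
        """Advance past both the local time and the sender's timestamp."""
        self.time = max(self.time, remote_time) + 1
        return self.time


# Node a stamps an outgoing message; node b receives it.
a, b = LamportClock(), LamportClock()
sent = a.tick()
got = b.receive(sent)
```

The receive event always carries a larger timestamp than the send event that caused it, regardless of how far the two machines' wall clocks have drifted.
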
what about CLUSTER MEMBERSHIP
- fixed membership, a priori
- don’t tie to net_kernel, net_ticktime
- store information locally, gossip
- hyparview, plumtree, thicket

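The "store information locally, gossip" idea can be sketched roughly as below. This is a deliberately naive push-gossip simulation in Python, not hyparview, plumtree, or thicket; the view format (a per-node version and status) and all function names are assumptions made for illustration.

```python
import random


def merge_views(local, remote):
    """Merge two membership views, keeping the higher-versioned entry per node."""
    merged = dict(local)
    for node, (version, status) in remote.items():
        if node not in merged or version > merged[node][0]:
            merged[node] = (version, status)
    return merged


def gossip_round(views, rng):
    """Each node pushes its local view to one randomly chosen peer."""
    nodes = list(views)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        views[peer] = merge_views(views[peer], views[node])


# Every node starts out knowing only about itself.
views = {n: {n: (1, "up")} for n in ["n1", "n2", "n3", "n4"]}
rng = random.Random(0)

for _ in range(100):  # generous upper bound on rounds for this sketch
    if all(len(view) == len(views) for view in views.values()):
        break
    gossip_round(views, rng)
```

Each node holds its own copy of the membership and no node acts as a coordinator, which is the property the slide is pointing at.
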
what about FAILURE DETECTION
- detection of delays vs. failed nodes
- net_kernel, net_ticktime
- the φ accrual failure detector
- swim: membership and failure detector

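To make the delay-versus-failure distinction concrete, here is a simplified Python sketch in the spirit of the φ accrual failure detector. It assumes heartbeat inter-arrival times are exponentially distributed, which is a simplification of the original formulation (which fits a normal distribution); the class name and window size are hypothetical.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Outputs a suspicion level (phi) instead of a binary up/down verdict."""

    def __init__(self, window=100):
        self.intervals = deque(maxlen=window)  # recent heartbeat gaps
        self.last = None                       # time of the last heartbeat

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now):
        """Suspicion level at time `now`; grows as the silence gets longer."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last
        # Assumed model: P(next heartbeat later than elapsed) = exp(-elapsed / mean)
        # phi = -log10(P) = elapsed / (mean * ln 10)
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)

phi_short_silence = detector.phi(4.0)   # one interval late
phi_long_silence = detector.phi(13.0)   # ten intervals late
```

The caller picks a threshold on phi that suits the application, so a slow node and a dead node are separated by degree of suspicion rather than by a single fixed timeout.
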
what about VALUE DIVERGENCE
- replicated data can diverge
- lamport clock, vector clocks, version vectors, wall clock
- identify concurrency
- how to resolve? lww vs. siblings vs. crdt

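As a rough illustration of how vector clocks identify concurrency, the Python sketch below represents a clock as a plain dict from node name to counter. The function names are hypothetical and are not taken from any particular library.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a, b):
    """True if clock `a` has seen every event that clock `b` has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify the relationship between two vector clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "after"
    if descends(b, a):
        return "before"
    return "concurrent"


base = increment({}, "n1")      # written once on n1
left = increment(base, "n1")    # updated again on n1
right = increment(base, "n2")   # updated independently on n2

relation = compare(left, right)
```

Here `left` and `right` both descend from `base` but not from each other, so `relation` is "concurrent". That is the case where a resolution strategy (lww, siblings, or a crdt) has to be chosen.
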
what about DISTRIBUTION BUFFERS
- large objects can cause head-of-line blocking
- move objects to tcp sockets
- increase distribution port buffer size
- beware unbounded queues

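The "beware unbounded queues" point can be illustrated with a bounded mailbox that sheds load once it is full. This is a generic Python sketch built on the standard library `queue` module; it does not model Erlang's distribution buffers, and the class name and drop policy are assumptions.

```python
import queue


class BoundedMailbox:
    """Fixed-capacity mailbox that rejects new messages instead of growing."""

    def __init__(self, capacity):
        self._q = queue.Queue(maxsize=capacity)
        self.dropped = 0  # count of rejected messages, useful for monitoring

    def offer(self, msg):
        """Try to enqueue `msg`; return False (and count a drop) when full."""
        try:
            self._q.put_nowait(msg)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self):
        """Dequeue the oldest message; raises queue.Empty if there is none."""
        return self._q.get_nowait()


box = BoundedMailbox(capacity=2)
results = [box.offer(m) for m in ("a", "b", "c")]
```

Rejecting the third message makes overload visible to the sender right away, rather than letting memory use and latency grow without limit.
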
what are the LESSONS
- distributed erlang gets you part of the way
- but you still have to understand the problems and the tradeoffs
http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html

what are some useful ERLANG LIBRARIES
- riak_core: distributed systems toolkit
  https://github.com/basho/riak_core
- riak_ensemble: generic multi-paxos framework
  https://github.com/basho/riak_ensemble
- riak_dt: conflict-free replicated data types
  https://github.com/basho/riak_dt

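For readers new to conflict-free replicated data types, here is a minimal Python sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. It only illustrates the idea and is not the riak_dt API; the class and method names are hypothetical.

```python
class GCounter:
    """Grow-only counter: each node increments its own slot; merge takes maxima."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, node, amount=1):
        """Record `amount` increments performed at `node`."""
        self.counts[node] = self.counts.get(node, 0) + amount

    def value(self):
        """Total count across all nodes."""
        return sum(self.counts.values())

    def merge(self, other):
        """Combine two replicas; commutative, associative, and idempotent."""
        nodes = self.counts.keys() | other.counts.keys()
        return GCounter({
            n: max(self.counts.get(n, 0), other.counts.get(n, 0))
            for n in nodes
        })


# Two replicas accept increments independently, then exchange state.
a, b = GCounter(), GCounter()
a.increment("n1")
a.increment("n1")
b.increment("n2")
merged = a.merge(b)
```

Because merge takes the per-node maximum, replicas reach the same value no matter the order or number of times they exchange state.
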
what are some useful RESEARCH PROJECTS
- syncfree: large-scale computation on erlang
  https://github.com/syncfree
- release: large-scale erlang deployments
  https://github.com/release-project/
- paraphrase: parallel computation
  https://github.com/paraphrase