Erlang - Building Blocks for Global Distributed Systems

Sean Cribbs
February 11, 2015

Erlang is a programming language and runtime originally created for high-reliability telephony applications. Ericsson created Erlang in the late 1980s to build Asynchronous Transfer Mode (ATM) switches. These switches made possible the broadband we know today. In fact, British Telecom, a client of Ericsson, reported that the switches attained nine 9s of reliability.

Careful language and runtime design, live upgrades, and interactive inspection of the running system all made this possible. More recently, Erlang has seen a resurgence, driven by its use in Internet-scale distributed systems.

In this talk, we'll discuss the motivations behind Erlang, the features that distinguish it from other software platforms, and the ways it simplifies building highly available, globally distributed services.

Transcript

  1. Erlang & Distributed Systems Sean Cribbs Motivation Erlang Basics OTP Runtime Erlang in DistSys Use-Cases Open Source Services and Proprietary
    Erlang: Building Blocks for Global Distributed Systems
    Sean Cribbs, @seancribbs
    Chicago ACM, 11 February 2015
  2. About Me
    - Senior Engineer at Basho, Makers of Riak
    - Erlanger since 2008
    - Distributed Systems, Web Architecture, Compilers
  3. Outline
    - Why Erlang?
    - Working in Erlang
      - Basics
      - OTP Runtime
    - Building Distributed Systems in Erlang
    - Erlang Distributed Systems in Industry
      - Open Source
      - Services and Proprietary
  4. Section 1: Why Erlang?
  5. Context
    - Ericsson CS Lab, mid-late 1980s
    - PLEX (proprietary) and C
    - Complicated and error-prone
  6. Requirements
    - Isolate faults (bugs)
    - Limit downtime
    - Soft-realtime
    - Simple
  7. Requirements
    - Fault isolation
      - All software will have bugs!
      - Don’t share memory, send messages
      - Treat values as immutable
      - “Let it crash”
  8. Requirements
    - Fault isolation
      - All software will have bugs!
      - Don’t share memory, send messages
      - Treat values as immutable
      - “Let it crash”
    - Limit downtime
      - Watch components
      - Restart after failures
      - Don’t retry forever
      - Live console interaction
      - Load new code without restarting
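The "let it crash", "watch components", "restart after failures" and "don't retry forever" bullets can be illustrated with a small sketch. This is not code from the talk: the module name `crash_demo` and the function `supervise/2` are hypothetical, and a real system would normally use an OTP supervisor instead of a hand-written loop.

```erlang
-module(crash_demo).
-export([run/0, supervise/2]).

%% Run a worker that always crashes, allowing at most 3 attempts.
run() ->
    supervise(fun() -> exit(boom) end, 3).

%% Start Fun in its own isolated process and watch it with a monitor.
%% A crash in the worker cannot corrupt the watcher: the watcher only
%% receives a 'DOWN' message describing what happened.
supervise(_Fun, 0) ->
    gave_up;                                  % don't retry forever
supervise(Fun, AttemptsLeft) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        {'DOWN', Ref, process, Pid, normal} ->
            ok;                               % worker finished cleanly
        {'DOWN', Ref, process, Pid, _Reason} ->
            supervise(Fun, AttemptsLeft - 1)  % restart after failure
    end.
```

Calling `crash_demo:run()` gives up after three failed attempts, while a worker that returns normally yields `ok` on the first try.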
  9. Requirements
    - Soft-realtime
      - Low-latency
      - Eager evaluation
      - Prevent starvation via pre-emption
      - Virtual machine and emulator
  10. Requirements
    - Soft-realtime
      - Low-latency
      - Eager evaluation
      - Prevent starvation via pre-emption
      - Virtual machine and emulator
    - Simple
      - High-level
      - Functional / declarative
      - Simple, composable data types
      - Strong abstractions around edges
  11. Erlang begins...
    - Named after Agner Krarup Erlang and short for “Ericsson Language”
    - Joe Armstrong, Mike Williams, Bjarne Däcker, Robert Virding
    - Initial versions in Prolog, later reimplemented in C
  12. Erlang begins...
    - Named after Agner Krarup Erlang and short for “Ericsson Language”
    - Joe Armstrong, Mike Williams, Bjarne Däcker, Robert Virding
    - Initial versions in Prolog, later reimplemented in C
    - YouTube: “Erlang the Movie”
  13. AXD-301: Erlang Success Story
    - AXE series was colossal failure
    - AXD-301 started in 1996
    - 3MLoC: Erlang
    - 500KLoC: C
    - 13KLoC: Java
    - Separate user plane from control plane
    - BT reports nine 9s
  14. Post AXD-301
    - After AXD-301 launch, Ericsson banned Erlang from internal projects.
    - Moved to open-source shortly after (1999?).
    - After open-source:
      - Distributed Erlang
      - Binaries & bit-syntax
      - Async threads
      - HiPE & Dialyzer
      - SMP
      - Native functions (NIFs)
      - Maps
      - Dirty schedulers
  15. Section 2: Working in Erlang
  16. Working in Erlang
    - Live coding time!
  17. Key Takeaways
    - Simple datatypes
    - Dynamic typing
    - Pattern matching
    - Immutable data
    - Functional
    - Cheap processes
    - Message-passing
    - Hot code-loading
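The live-coding session itself is not part of the transcript. As a stand-in, here is a minimal sketch touching several of the takeaways (simple datatypes, pattern matching, cheap processes, message passing). The module and function names are hypothetical and are not taken from the talk.

```erlang
-module(takeaways).
-export([area/1, ping/0]).

%% Pattern matching over simple, immutable data (tagged tuples).
area({circle, R})  -> math:pi() * R * R;
area({rect, W, H}) -> W * H.

%% Cheap processes and message passing: spawn a process, send it a
%% message carrying our pid, and wait for its reply.
ping() ->
    Parent = self(),
    Child = spawn(fun() ->
                      receive
                          {ping, From} -> From ! {pong, self()}
                      end
                  end),
    Child ! {ping, Parent},
    receive
        {pong, Child} -> pong
    after 1000 ->
        timeout
    end.
```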
  18. OTP Runtime
        loop(V) ->
            receive
                {set, X} -> loop(X);
                {get, Pid} -> Pid ! V, loop(V)
            end.
  19. OTP Runtime
        loop(V) ->
            receive
                {set, X} -> loop(X);
                {get, Pid} -> Pid ! V, loop(V)
            end.
    - Problems
      - Manual tail-call over receive
      - Limited extensibility
      - Difficult to inspect externally
      - No interface for caller
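To make the "no interface for caller" problem concrete, the bare loop can be wrapped in a few client functions by hand. The sketch below is an illustration only: the module name `cell`, the functions `start/1`, `write/2` and `read/1`, and the reference-tagged `get` message are assumptions that differ slightly from the protocol on the slide. This is roughly the boilerplate that OTP behaviours take over.

```erlang
-module(cell).
-export([start/1, write/2, read/1]).

%% Client interface: callers never build protocol messages themselves.
start(Initial) ->
    spawn(fun() -> loop(Initial) end).

write(Cell, X) ->
    Cell ! {set, X},
    ok.

read(Cell) ->
    Ref = make_ref(),               % tag the request so that only the
    Cell ! {get, self(), Ref},      % matching reply is accepted
    receive
        {Ref, V} -> V
    after 1000 ->
        exit(timeout)
    end.

%% Server loop: same shape as the slide, with the tagged reply.
loop(V) ->
    receive
        {set, X} -> loop(X);
        {get, From, Ref} -> From ! {Ref, V}, loop(V)
    end.
```

Because messages between one pair of processes arrive in order, a `write/2` followed by a `read/1` from the same caller observes the new value.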
  20. OTP Runtime
    - Patterns
      - Server
      - Finite State Machine
      - Event handler
      - Supervisor
      - Application
      - Release
        -module(myserver).
        -behavior(gen_server).
        %% Callbacks must be exported so gen_server can call them.
        -export([init/1, handle_call/3, handle_cast/2, handle_info/2]).
        init([]) -> {ok, 0}.
        handle_call(get, _From, Value) -> {reply, Value, Value}.
        handle_cast({set, New}, _Value) -> {noreply, New}.
        handle_info(_Msg, Value) -> {noreply, Value}.
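The slide shows only the callbacks. A caller-facing API on top of them might look like the sketch below, which uses the standard `gen_server:start_link/3`, `gen_server:call/2` and `gen_server:cast/2`. The module name `myserver_demo` and the client function names `value/1` and `set_value/2` are hypothetical.

```erlang
-module(myserver_demo).
-behaviour(gen_server).

%% Client API (hypothetical names, not from the slide)
-export([start_link/0, value/1, set_value/2]).
%% gen_server callbacks (same logic as on the slide)
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Synchronous request: blocks until the server replies.
value(Server) ->
    gen_server:call(Server, get).

%% Asynchronous request: returns ok immediately.
set_value(Server, New) ->
    gen_server:cast(Server, {set, New}).

init([]) -> {ok, 0}.
handle_call(get, _From, Value) -> {reply, Value, Value}.
handle_cast({set, New}, _Value) -> {noreply, New}.
handle_info(_Msg, Value) -> {noreply, Value}.
```

Compared with the hand-written loop, the receive loop, reply tagging and timeouts are handled by `gen_server`, and the process can be inspected with standard tools such as `sys:get_state/1`.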
  21. OTP Runtime
    - Libraries
      - Data structures
      - Embedded databases (ets, dets, mnesia)
      - Operating system services
      - Input, output, encodings
      - Monitoring and logging
      - External interfaces (Java, C)
      - Internet Services (HTTP, FTP, SSH)
      - GUI toolkits (wx, gs)
      - Tracing and debugging
      - Unit and Functional Testing
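As one example from the library list, `ets` provides in-memory tables owned by a process. The following is a minimal sketch; the module name, table name and keys are made up for illustration.

```erlang
-module(ets_demo).
-export([run/0]).

%% Create a private in-memory table, store two tuples keyed on their
%% first element, look one up, and look up a key that is absent.
run() ->
    Tab = ets:new(sessions, [set, private]),
    true = ets:insert(Tab, {alice, online}),
    true = ets:insert(Tab, {bob, away}),
    [{alice, Status}] = ets:lookup(Tab, alice),
    Missing = ets:lookup(Tab, carol),   % [] when the key is absent
    true = ets:delete(Tab),             % drop the whole table
    {Status, Missing}.
```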
  22. Section 3: Building Distributed Systems in Erlang
  23. Why Erlang for Distributed Systems?
    “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.” – Leslie Lamport
  24. Why Erlang for Distributed Systems?
    - Fault-tolerance from low-level to high-level
    - Great networking support
    - Distributed Erlang (location transparency)
    - Uniform application structure and patterns
    - Easy to express solutions to distributed problems
  25. Why Erlang for Distributed Systems?
    - Fault-tolerance from low-level to high-level
    - Great networking support
    - Distributed Erlang (location transparency)
    - Uniform application structure and patterns
    - Easy to express solutions to distributed problems
    - Caveats
      - Queue management is hard (unbounded)
      - Messages can be lost!
      - Best components often third-party or roll-your-own (RYO)
      - Ericsson is super-conservative
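Because messages can be lost and a remote process can disappear, a request usually needs both a monitor and a timeout. The sketch below shows one way to write such a call by hand. The module name `safe_call` and the message shapes are assumptions for illustration; `gen_server:call/3` does something similar internally. Thanks to location transparency, `Dest` can be a local pid, a remote pid, or a `{Name, Node}` pair.

```erlang
-module(safe_call).
-export([call/3, echo_server/0]).

%% Send Request to Dest and wait at most Timeout milliseconds.
%% Returns {ok, Reply}, {error, {down, Reason}} or {error, timeout}.
call(Dest, Request, Timeout) ->
    Ref = erlang:monitor(process, Dest),
    Dest ! {request, self(), Ref, Request},
    receive
        {reply, Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, _Pid, Reason} ->
            {error, {down, Reason}}      % the peer died or never existed
    after Timeout ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}                 % request or reply may be lost
    end.

%% A trivial server that echoes each request back to its sender.
echo_server() ->
    receive
        {request, From, Ref, Request} ->
            From ! {reply, Ref, Request},
            echo_server()
    end.
```

A timeout here does not say whether the server acted on the request, which is why retries in such designs generally need idempotent operations.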
  26. Pattern: Survey
    - Survey pattern
    - FSM stages
  27. Pattern: Survey
        -module(survey_fsm).
        -behaviour(gen_fsm).
        %% Exports are not shown on the slide; gen_fsm and spawn/4 need them.
        -export([start_link/1, init/1, distribute/2, collect/2, finish/2, do_work/1]).
        -record(state, {caller, nodes=[], workers=[], replies=[]}).
        start_link(Nodes) ->
            gen_fsm:start_link(?MODULE, [self(), Nodes], []).
        %% A timeout of 0 moves straight into the distribute state.
        init([Caller, Nodes]) ->
            {ok, distribute, #state{caller=Caller, nodes=Nodes}, 0}.
  28. Pattern: Survey
        %% Start one worker per node, then wait for replies.
        distribute(timeout, #state{nodes=Nodes}=State) ->
            Workers = [ spawn(Node, ?MODULE, do_work, [self()]) || Node <- Nodes ],
            {next_state, collect, State#state{workers=Workers}}.
        %% The slide calls crypto:rand_bytes/1, which later OTP releases removed.
        do_work(FSM) ->
            gen_fsm:send_event(FSM, {reply, crypto:strong_rand_bytes(10)}).
  29. Pattern: Survey
        %% Accumulate replies until every worker has answered.
        collect({reply, Result}, #state{workers=Workers, replies=Replies0}=State0) ->
            Replies = [Result|Replies0],
            State = State0#state{replies=Replies},
            if
                length(Replies) == length(Workers) ->
                    {next_state, finish, State, 0};
                true ->
                    {next_state, collect, State}
            end.
  30. Pattern: Survey
        %% Deliver the collected replies to the caller and stop.
        finish(timeout, #state{caller=Caller, replies=Replies}=State) ->
            Caller ! {survey, Replies},
            {stop, normal, State}.
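The FSM on the slides waits until every worker has replied, so a single lost message or dead node would leave it in `collect` indefinitely. As a hedged variation, the same scatter-gather idea can be sketched with plain processes and an overall deadline. The module name `survey_sketch`, the deadline handling and the reply shape are assumptions, not the speaker's code, and `crypto:strong_rand_bytes/1` is used as the worker's stand-in result.

```erlang
-module(survey_sketch).
-export([survey/2, do_work/2]).

%% Ask every node for one reply and return whatever has arrived when
%% all workers answered or Timeout milliseconds have passed.
survey(Nodes, Timeout) ->
    Ref = make_ref(),
    Collector = self(),
    _ = [spawn(Node, ?MODULE, do_work, [Collector, Ref]) || Node <- Nodes],
    Deadline = erlang:monotonic_time(millisecond) + Timeout,
    collect(Ref, length(Nodes), Deadline, []).

collect(_Ref, 0, _Deadline, Replies) ->
    Replies;                              % everyone answered
collect(Ref, Remaining, Deadline, Replies) ->
    Wait = max(0, Deadline - erlang:monotonic_time(millisecond)),
    receive
        {reply, Ref, Result} ->
            collect(Ref, Remaining - 1, Deadline, [Result | Replies])
    after Wait ->
        Replies                           % partial result at the deadline
    end.

%% Worker: runs on the target node and reports back to the collector.
do_work(Collector, Ref) ->
    Collector ! {reply, Ref, {node(), crypto:strong_rand_bytes(10)}}.
```

For a quick local check, `survey_sketch:survey([node(), node()], 1000)` surveys the current node twice and returns two replies.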
  31. Section 4: Erlang Distributed Systems in Industry
  32. Riak & Riak Core (http://docs.basho.com)
    - Dynamo-like Key-Value store
    - HA and SC modes
    - CRDTs
    - Search, Secondary Indexes, MapReduce
    - Multi-Datacenter (license only)
    - Riak Core
      - Dynamo, abstracted
      - Cluster membership
      - Partition ownership
      - Virtual nodes (vnodes)
      - Handoff
      - Coverage planning
      - Cluster metadata over gossip
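Partition ownership in a Dynamo-style ring can be illustrated with a toy sketch: hash a key onto a fixed-size ring, cut the ring into equal partitions, and assign partitions to nodes. This is not Riak Core's API. The module name `ring_sketch`, the SHA-1 hash choice and the round-robin ownership rule are all assumptions made for illustration; a real system also handles membership changes, handoff and replicas.

```erlang
-module(ring_sketch).
-export([partition/2, owner/3]).

-define(RING_BITS, 160).   % size of a SHA-1 digest in bits

%% Map a binary key to one of NumPartitions equal slices of the hash space.
partition(Key, NumPartitions) when is_binary(Key), NumPartitions > 0 ->
    <<Hash:?RING_BITS/integer>> = crypto:hash(sha, Key),
    SliceSize = (1 bsl ?RING_BITS) div NumPartitions,
    %% min/2 guards the last slice when the division leaves a remainder.
    min(Hash div SliceSize, NumPartitions - 1).

%% Toy ownership rule: partitions are dealt out to nodes round-robin.
owner(Key, NumPartitions, Nodes) ->
    Index = partition(Key, NumPartitions),
    lists:nth((Index rem length(Nodes)) + 1, Nodes).
```

The mapping is deterministic, so any node that knows the partition count and the node list can compute the owner of a key locally without asking a coordinator.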
  33. Riak CS
    - S3 and Swift interface
    - Block storage over Riak
    - Usage accounting
    - Remote fetch (over licensed MDC)
  34. Disco (http://discoproject.org)
    - MapReduce system, like Hadoop
    - Originally developed at Nokia
    - Write jobs in Python, Erlang distributes them
  35. Project FiFo (https://project-fifo.net/)
    - Cloud orchestration for SmartOS (Illumos)
    - “Private cloud” / IaaS
    - Uses Riak Core for some components
    - Includes LeoFS for storage (S3-like)
    - Multi-datacenter capability
  36. WhatsApp
    - Mobile messaging app
    - Recently purchased by Facebook for $19Bn
    - Scaled to 2 Million connections per machine!
    - Originally based on ejabberd, but quickly became custom
  37. OpenX®
    - Online Advertising Network
    - Real-time Bidding
    - Impressions
    - Ad Delivery
    - Monitoring
    - Also use Riak
  38. Chef (formerly Opscode)
    - Configuration Management, much code in Ruby
    - Web Services in Erlang:
      - Chef Server 12
      - Analytics
    - Other projects in the works
  39. Thanks
    - Francesco Cesarini, Erlang Solutions
    - Heinz Gies, Project FiFo
    - Rick Reed, WhatsApp
    - Anthony Molinaro, OpenX
    - Joe DeVivo, Chef