
Introduction to Erlang


Elegance always matters, especially when creating software: why should distributed, sophisticated real-time systems be an exception?

This introduction describes Erlang - its syntax and ideas.

Gianluca Costa

July 14, 2016


Transcript

  1. Gianluca Costa
    Introduction to Erlang
    http://gianlucacosta.info/


  2. Part 1
    The journey begins


  3. Preface
    Elegance always matters, especially when creating
    software: why should distributed, sophisticated real-time
    systems be an exception?
    This introduction describes Erlang - its syntax and ideas - with
    no claim of completeness: for further information and details,
    please refer to the online documentation and books.


  4. Special thanks
    Special thanks to:

    Professor Paolo Bellavista, for his valuable advice and
    suggestions

    Francesco Cesarini and Simon Thompson, authors of
    the book «Erlang Programming» which, together with
    Erlang’s documentation, inspired this work

    Ericsson, for inventing such an interesting language and
    making it open source


  5. Can anything other than C++ be fast enough?

    «[...]if your target system is a high-level, concurrent, robust, soft real-time
    system that will scale in line with demand, make full use of multicore
    processors, and integrate with components written in other languages,
    Erlang should be your choice» (cit. “Erlang Programming”)

    Case studies revealed that Erlang systems proved to be more performant
    and stable than the equivalent C++ versions, especially under heavy loads

    In many contexts, the crux of the problem is finding smart algorithms,
    not chasing bit-level performance

    Erlang is a language designed to tackle real-world problems with
    minimalism and elegance


  6. Brief history

    1980s: researchers at Ericsson’s Computer Science
    Laboratory, after analyzing several languages, start creating a
    new one
    – The first Erlang VM was Prolog-based – which explains a lot of
    similarities between Erlang and Prolog

    1991: first release of the C-based Erlang VM:
    BEAM = Björn’s Erlang Abstract Machine

    1992-1994: first commercial project written in Erlang

    1996: introduction of the OTP (=Open Telecom Platform)
    framework

    1998: Erlang becomes open source


  7. Why Erlang?
    Cross-platform
    Virtual Machine
    Functional
    Declarative
    Pattern matching
    Immutable structures
    Higher-order functions
    List manipulation
    Optimized tail recursion
    OTP middleware libraries
    Per-process garbage collection
    Lightweight processes
    Advanced IPC
    Code patching
    Bit sequence patterns
    Visual debugger
    Visual process manager
    Interactive shell
    Minimalist
    Open source
    Interoperability with other ecosystems
    Transparent SMP
    Easy inter-node communication
    Dynamic, but compiled
    EDoc documentation


  8. Installing Erlang

    On Ubuntu, the fastest way is running:
    sudo apt-get install erlang

    On Windows, the easiest way is the setup wizard
    To start the interactive shell, run: erl
    To execute the compiler, run: erlc
    In the shell, commands end with “.”, so multiline commands
    are supported


  9. Part 2
    Basic syntax


  10. Language overview

    Erlang is a dynamically-typed language – in
    particular, you don’t need to declare variable types

    However, Erlang is a compiled language,
    therefore source files must be explicitly compiled –
    which prevents runtime syntax errors

    Erlang’s syntax and basic ideas are fairly similar to
    Prolog’s; one of the first and most important
    differences is that computation output is expressed
    by the return value of functions, not by parameters.


  11. Integers

    Integers have no bound. Internally:
    – By default, for performance, small integers are stored in a single machine word
    – Larger integers are slower, but their size is only constrained by the available memory

    An integer constant can be defined as follows:
    [+|-][Base#]Value
    – Base can be 2 to 36; if omitted, 10 is assumed
    – Value is expressed via digits and, if Base > 10, via letters (A to F for base 16, up to Z for base 36)

    Examples: 8, -90, 16#EA3, -8#32

    Integer division: div (infix operator) → example: 13 div 5

    Integer remainder: rem (infix operator) → example: 13 rem 5
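
The literal notation and the two integer operators above can be tried out with a small module. This is only an illustrative sketch: the module name integers_demo and the labels in the result are made up for the example.

```erlang
-module(integers_demo).
-export([examples/0]).

%% Returns {Label, Value} pairs illustrating integer literals and the
%% integer-only operators div and rem.
examples() ->
    [{hex_literal, 16#EA3},        % base-16 literal, equals 3747
     {octal_literal, -8#32},       % negative base-8 literal, equals -26
     {quotient, 13 div 5},         % integer division, equals 2
     {remainder, 13 rem 5},        % integer remainder, equals 3
     %% The product below does not fit in a 64-bit word, yet it is
     %% computed exactly because integers are unbounded.
     {big_product, 100000000000 * 100000000000}].
```

After compiling with c(integers_demo) in the shell, integers_demo:examples() returns the list of pairs.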


  12. Floats

    Float = floating point number, as described by
    IEEE754-1985 standard

    Examples: 9.07, -1.0e12

    Usually slower than integers

    The division operator / always returns a float

    As usual, operations mixing integers and floats
    upcast their operands to float

    Integers and floats are both classified as numbers


  13. ASCII characters

    There is no char type: instead, you can express an
    integer using its ASCII character via a dedicated notation:
    $Character
    Examples:
    – $A is the integer value 65
    – $\n is the integer value 10

    Consequently, on the shell, single characters are printed
    out as numbers


  14. Atoms

    Atom = constant literal standing for itself

    To define an atom, just use it in code:
    – Without quotes, it must match this regex: [a-z][A-Za-z0-9_@]*
    Examples: true, ok, helloWorld, error@point
    – Otherwise, it must be within single quotes
    Examples: 'EXIT', 'spaces CAN be included'

    The memory footprint of an atom is constant – whatever its length

    Optimized for equality check and pattern matching

    Atoms are not garbage-collected – beware of memory leaks if
    you generate, via dedicated built-in functions, an unbounded number of
    atoms over time!


  15. Boolean values

    true and false are just atoms – and they happen to be
    returned, for example, by comparison and logic operators

    Comparison operators:
    – <, >, >=, =< (à la Prolog)
    – ==, /= → equal/not equal, disregarding the type
    – =:=, =/= → type-checking equal/not equal (efficient!)

    Logical operators: and, andalso (short-circuits), or,
    orelse (short-circuits), xor, not
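
The difference between the two equality operators, and the effect of short-circuiting, can be sketched as follows. The module and function names are hypothetical and exist only for this illustration.

```erlang
-module(compare_demo).
-export([equal_loose/2, equal_strict/2, safe_ratio_above_one/1]).

%% == compares numbers by value, so an integer can equal a float.
equal_loose(A, B) -> A == B.

%% =:= also requires both operands to have the same type.
equal_strict(A, B) -> A =:= B.

%% andalso evaluates its right operand only when the left one is true,
%% so the division is never attempted when N is 0.
safe_ratio_above_one(N) -> (N =/= 0) andalso (10 div N > 1).
```

For instance, compare_demo:equal_loose(1, 1.0) is true while compare_demo:equal_strict(1, 1.0) is false, and compare_demo:safe_ratio_above_one(0) returns false instead of raising badarith.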


  16. Guards

    Guard = expression returning true or false that can only contain:
    – Bound variables and constants
    – Arithmetic, comparison, logic, bitwise operators
    – A few built-in functions
    – Compatible preprocessor macros

    User-defined functions are not allowed, to prevent side-effects –
    because Erlang is functional, but not purely functional

    Guard expressions can be combined using:
    – Comma (,), to join expressions using logic and
    – Semicolon (;), to join expressions using logic or

    Should a runtime error occur while evaluating a guard, it is silently
    caught and the guard returns false
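
A minimal sketch of these rules, using a hypothetical classify/1 function: it combines guard tests with comma and semicolon, and relies on the fact that an error inside a guard just makes the guard fail.

```erlang
-module(guard_demo).
-export([classify/1]).

%% Comma means "and": both tests must hold.
classify(X) when is_integer(X), X > 0 -> positive_integer;
%% Semicolon means "or": either test is enough.
classify(X) when is_float(X); is_atom(X) -> float_or_atom;
%% length/1 raises an error for non-lists; inside a guard the error is
%% swallowed and the guard simply fails, so the next clause is tried.
classify(X) when length(X) > 2 -> long_list;
classify(_) -> other.
```

Calling guard_demo:classify({a, b}) falls through to the last clause and returns other, even though length({a, b}) would crash outside a guard.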


  17. Tuples

    Tuple = collection of items designed to be:
    – Heterogeneous → you can mix elements of all types
    – Immutable
    – Arbitrarily nestable
    – Fixed-size

    To define a tuple, just employ:
    – {item1, item2, …, itemN}
    – {} → empty tuple

    When the first item in a tuple is an atom, the atom is called a tag and the tuple is
    said to be tagged → tagged tuples are especially useful to:
    – Exchange messages between processes
    – Return multiple values from a function

    However, in lieu of tagged tuples, consider using records (see later)


  18. Lists

    Like tuples, lists are heterogeneous, immutable,
    nestable collections of items

    Being immutable, a given list never changes size; however, lists are
    meant to be processed – it is fairly common to derive shorter or
    longer lists from existing ones

    To define a proper list:
    – [item1, item2, …, itemN]
    – [] → empty list
    – [headItem1, headItem2, … | tail] → where tail is another
    proper list


  19. List operations

    length(AList) returns the number of items in the given list

    ++ (infix) concatenates 2 lists. As usual, it is O(N) with respect
    to the length of the first list, so use it carefully; often, it is more
    efficient to prepend items to the result (which is O(1)) and
    reverse it at the end

    -- (infix) removes every item in the right-hand list (at most once
    per occurrence) from the left-hand list

    ++ and -- are right-associative – of course, you can force
    precedence via parentheses

    Lists support usual pattern matching, especially [Head | Tail] or
    the other constructors mentioned previously
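
The prepend-then-reverse idiom and the right-associativity of -- can be sketched as below. The module and function names are invented for the example; lists:reverse/1 comes from the standard lists module.

```erlang
-module(list_ops_demo).
-export([double_all/1, associativity/0]).

%% Doubles every item by prepending to an accumulator (O(1) per item)
%% and reversing once at the end, instead of appending with ++.
double_all(List) -> double_all(List, []).

double_all([], Acc) -> lists:reverse(Acc);
double_all([Head | Tail], Acc) -> double_all(Tail, [2 * Head | Acc]).

%% -- is right-associative: the first expression is parsed as
%% [1, 2, 3] -- ([1, 2] -- [2]), that is [1, 2, 3] -- [1].
associativity() ->
    {[1, 2, 3] -- [1, 2] -- [2],
     ([1, 2, 3] -- [1, 2]) -- [2]}.
```

Here list_ops_demo:associativity() returns {[2, 3], [3]}, which shows why parentheses matter when chaining --.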


  20. Strings

    Strings in Erlang are simply lists of integers

    This has two convenient consequences:
    – Double-quoted strings are supported: "Hi" is shorthand for [$H, $i], which in
    turn is shorthand for [72, 105]
    – Lists of integers that can be mapped to printable ASCII characters will be
    printed out as double-quoted strings by the Erlang shell

    Unlike atoms, the length of a string affects its memory footprint: when
    handling huge strings, you might consider using binaries instead

    The compiler automatically concatenates adjacent string literals as if they
    were one: "Pandas are" " cute animals" == "Pandas are cute
    animals".
    Never use ++ to concatenate string constants.
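
A short sketch of the points above, with hypothetical module and function names: it checks that a double-quoted string really is a list of integers and that adjacent literals are merged at compile time.

```erlang
-module(string_demo).
-export([same_thing/0, facts/0]).

%% The three notations denote the very same list.
same_thing() ->
    ("Hi" =:= [$H, $i]) andalso ([$H, $i] =:= [72, 105]).

%% Adjacent literals are joined by the compiler, with no ++ involved.
facts() ->
    Joined = "Pandas are" " cute animals",
    {length(Joined), $A, $\n}.
```

string_demo:same_thing() returns true, and string_demo:facts() returns {23, 65, 10}.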


  21. List comprehensions

    Concise notation to express a pipeline of functional maps and
    filters on lists:
    [itemTerm || generator|filter, generator|filter, …]
    – itemTerm: expression denoting the generic item of the result list. Can
    combine variables provided by any generator
    – generator: construct of the form pattern <- sourceList –
    most often, it will be Variable <- sourceList
    – filter: a guard that can employ any variable bound by generators to its
    left

    Only items satisfying both generator patterns and filter guards
    participate in the creation of the new list

    List comprehensions can be nested and can reference (or even
    shadow) variables in outer comprehensions


  22. List comprehensions - Examples
    Let L1 = [5, 8, 10, 29] and L2 = [9, 10, 4, 0, 73, 2]
    Then:

    [X || X <- L1, X > 6] → [8, 10, 29]

    [{X, Y} || X <- L1, Y <- L2] is the Cartesian
    product L1×L2 → [{5, 9}, {5, 10}, {5, 4}, …]

    [X + 3 - Y || X <- L1, Y <- L2, Y < (X + 5)] → in
    this case, the filter on Y references X
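
One more case worth sketching is a generator whose left side is a real pattern rather than a plain variable: items that do not match are silently skipped. The module and function names are hypothetical.

```erlang
-module(lc_demo).
-export([oks/1, odd_pairs/0]).

%% The generator pattern {ok, V} acts as a filter: items that do not
%% match it (error tuples, stray atoms, ...) are skipped.
oks(Results) -> [V || {ok, V} <- Results].

%% Two generators plus a filter that only involves the first one.
odd_pairs() -> [{X, Y} || X <- [1, 2, 3], Y <- [a, b], X rem 2 =:= 1].
```

For example, lc_demo:oks([{ok, 1}, {error, x}, {ok, 3}, whatever]) returns [1, 3].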


  23. References

    Reference = a term that is unique in an Erlang
    runtime system

    To create a reference, use make_ref(), usually
    assigning its return value to a variable

    References are especially employed in
    message passing, in particular to
    unambiguously identify the reply to a given
    request sent


  24. Comments

    Erlang only supports single-line comments,
    starting with %

    Comments can also be parsed by dedicated
    tools providing, for example, autocompletion
    and smart suggestions


  25. Variables

    Variables always start with an uppercase letter or an
    underscore

    Single-assignment principle: any variable can be assigned a
    value only once per scope

    Variables have no type declaration – Erlang is dynamically
    typed

    Assignment is just a special case of pattern matching –
    actually, = is called the match operator

    Variables can be assigned once in the shell, too; however, one
    can use f() to unbind them all, or f(Var) to unbind just the given
    Var.


  26. Pattern matching

    Similar to other functional languages:
    Pattern = Term
    where:
    – Pattern can include variables (bound and unbound) and constant terms
    – Term cannot include unbound variables

    Pattern and Term can have nested structures – which, of course, should match

    If matching succeeds, the unbound variables on the left get bound to the
    corresponding terms on the right

    There are also catch-all patterns:
    – _ (the underscore, called anonymous variable) always matches but is never bound
    – _Var → “don’t care” variables: they work like other variables, but the compiler won’t issue a
    warning if they are used in the pattern only. They are declared for the sake of clarity


  27. Case

    case is a conditional construct based on pattern matching:
    case expression of
        pattern1 [when guard1] -> expr1.1, …, expr1.N_1;
        (…)
        patternM [when guardM] -> exprM.1, …, exprM.N_M
    end

    Branches are checked from top to bottom, and only the first branch matching
    expression (and having guard missing or evaluated to true) is selected

    If no pattern matches, a runtime error occurs; to prevent this, you can use a
    catch-all (an unbound variable such as Default -> or the wildcard _ ->)

    case returns the last expression of the selected branch (so, you could have
    MyVar = case ...)
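
A minimal sketch of a case expression whose value is bound to a variable. The shapes and the function name area/1 are invented for the example.

```erlang
-module(case_demo).
-export([area/1]).

%% The value of the whole case expression is bound to Area.
area(Shape) ->
    Area = case Shape of
        {square, Side} when Side >= 0 -> Side * Side;
        {rectangle, W, H} when W >= 0, H >= 0 -> W * H;
        _ -> unknown          % catch-all: prevents a case_clause error
    end,
    {area, Area}.
```

case_demo:area({rectangle, 2, 3}) returns {area, 6}, while any unrecognized term yields {area, unknown}.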


  28. If

    if is a conditional construct based on guards:
    if
        guard1 -> expr1.1, …, expr1.N_1;
        (…)
        guardM -> exprM.1, …, exprM.N_M
    end

    The first guard, from top to bottom, that evaluates to true determines
    the chosen branch, whose last expression becomes the overall
    value of the if construct

    If no branch is selected, a runtime error occurs; the catchall
    guard is, of course, true ->


  29. Functions

    Erlang supports overloading based on arity (=number of
    parameters): functions with equal name but different arity are
    unrelated

    A function consists of clauses: a local function f having N
    parameters is referred to as f/N and is declared as follows:
    – f(arg1.1, …, arg1.N) [when guard1] ->
          expr1.1, …, expr1.M_1;
      f(arg2.1, …, arg2.N) [when guard2] ->
          expr2.1, …, expr2.M_2;
      (…)
      f(argZ.1, …, argZ.N) [when guardZ] ->
          exprZ.1, …, exprZ.M_Z.

    Clauses end with “;”, except the last one, ending with “.”

    Every argument can be an arbitrary pattern: so, unlike the case
    construct, function heads can match multiple values at once


  30. Functions - Examples

    f(A) -> A + 1.

    f(A, B) -> A + B.

    factorial(0) -> 1;
    factorial(N) when N > 0 -> N * factorial(N - 1).
    → this function has 2 clauses – and one of them employs a guard

    f({A, B}, C, D) ->
    Delta = C * D,
    {A + Delta, B + Delta}.
    → in this case, the first argument is a tuple, so binding is performed on its 2
    variables. The result is a tuple as well


  31. Clause order

    Clauses are checked from top to bottom: the first clause
    whose head (=pattern to the left of ->) matches the arguments
    - and whose guard is missing or true - is selected for execution

    If no head matches, a runtime error occurs

    Consequently, arguments are actually passed via pattern
    matching.

    Erlang performs call-by-value: all arguments are evaluated
    before calling a function

    The result of a function is the last expression of the selected
    clause body (=sequence of expressions to the right of ->)


  32. Recursion

    Functional languages like Erlang or Haskell do
    not provide destructive iteration constructs such
    as for, while, repeat

    Erlang is based on:
    – Recursive functions
    – Library functions hiding the destructive aspect of the
    accumulation process – such as foldl


  33. Tail recursion

    In almost every functional language, tail
    recursion receives a dedicated optimization:
    – Tail-recursive functions run like a C for loop –
    requiring constant memory, as no recursive calls are
    internally performed by the VM

    In Erlang, tail recursion is not necessarily better
    than non-tail recursion, as the latter is very
    optimized, too
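
To make the distinction concrete, here is a sketch of the same function written in both styles. The accumulator-based helper factorial_tail/2 is an illustrative choice, not something prescribed by the deck.

```erlang
-module(fact_demo).
-export([factorial/1, factorial_tail/1]).

%% Body-recursive: the multiplication happens after the recursive
%% call returns, so each call leaves work pending.
factorial(0) -> 1;
factorial(N) when N > 0 -> N * factorial(N - 1).

%% Tail-recursive: the recursive call is the last thing the clause
%% does, and the partial result travels in the accumulator Acc.
factorial_tail(N) when N >= 0 -> factorial_tail(N, 1).

factorial_tail(0, Acc) -> Acc;
factorial_tail(N, Acc) -> factorial_tail(N - 1, N * Acc).
```

Both fact_demo:factorial(5) and fact_demo:factorial_tail(5) return 120.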


  34. Modules

    Modules are .erl text files, whose structure is determined by attribute
    declarations:
    ➢ -module(moduleName): moduleName must be the file name without extension
    ➢ -vsn(version): the module’s version. If missing, a checksum is used
    ➢ -export([name/arity, name/arity, …]): list of functions that can be called from
    outside the module
    ➢ -import(modName, [name/arity, …]): functions exported by another module that
    can be called as if they were local functions

    Custom module attributes can be added – such as
    -author("AuthorName")

    Module:module_info/0 and /1 return metainfo for the given Module


  35. Employing modules

    Unlike languages such as Python, modules must be
    explicitly compiled – for example, via the shell’s
    c(ModuleName) function or via the erlc compiler.

    Bytecode modules have .beam extension

    Erlang finds modules in a code path - similar to Java’s
    classpath – returned by code:get_path/0

    To fix problems exposed by Java’s classpath (especially
    scan time), Erlang can be started in embedded mode

    Directories can be prepended/appended when starting
    Erlang or anytime during the execution


  36. Lambda functions

    Lambda function = anonymous function – therefore, it is generally assigned to a
    variable or passed as argument:
    fun
        (arg1.1, …, arg1.N) [when guard1] -> expr1.1, …, expr1.M_1;
        (…)
        (argK.1, …, argK.N) -> exprK.1, …, exprK.M_K
    end

    Lambda functions usually have just one clause; should they have more, all the
    heads must have the same arity

    A lambda function can reference bound variables of outer functions, defining a
    closure:
    h(A) ->
    fun (B) -> A * B end.


  37. Functional programming

    Erlang is functional: functions are not just static code blocks –
    they are values

    Therefore, functions can be assigned to variables, passed to and
    returned from other (higher-order) functions

    A lambda function can be assigned to a variable, but a standard
    function can be assigned as well:
    – To assign function f/N from module mymod, just use
    MyFunction = fun mymod:f/N
    – To assign a local function:
    MyFunction = fun f/N

    The lists module includes common functional utilities – map, filter,
    flatmap, foldl, all, any, ...
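
The following sketch combines a closure, a reference to a named function and a few utilities from the lists module. The module hof_demo and its functions are hypothetical names chosen for the example.

```erlang
-module(hof_demo).
-export([run/0, scale_by/1, increment/1]).

increment(X) -> X + 1.

%% Returns a closure that captures Factor.
scale_by(Factor) -> fun (X) -> Factor * X end.

run() ->
    Triple = scale_by(3),
    Mapped = lists:map(Triple, [1, 2, 3]),                      % [3, 6, 9]
    Bumped = lists:map(fun increment/1, Mapped),                % [4, 7, 10]
    Evens = lists:filter(fun (X) -> X rem 2 =:= 0 end, Bumped), % [4, 10]
    Sum = lists:foldl(fun (X, Acc) -> X + Acc end, 0, Evens),   % 14
    {Mapped, Bumped, Evens, Sum}.
```

hof_demo:run() returns {[3, 6, 9], [4, 7, 10], [4, 10], 14}.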


  38. BIFs

    BIFs = built-in functions, mainly belonging to the erlang
    module

    Most of them are auto-imported, so they can be called
    without qualifying them

    BIFs are often seen as part of the language:
    – Conversion functions
    – Basic list (hd/1, tl/1, length/1) and tuple (tuple_size/1)
    manipulation
    – Date/time
    – Process management and communication


  39. Calling a function

    To call a local function or BIF f/N: f(arg1, …, argN)

    To call a function f/N exported by module mymod:
    – mymod:f(arg1, …, argN) → fully-qualified call. In lieu of constants, mymod and f could
    be variables – to provide late binding. If you are used to Python, please note that
    modules do not need to be imported – but their compiled bytecode must be in the
    code search path.
    – Add the module attribute -import(mymod, [f/N]), then call f as if it were local: f(…)

    Of course, fully-qualified calls are allowed when calling local functions, too → in
    this case, the ?MODULE preprocessor constant is handy

    apply(Module, Function, ArgumentsList) is a metaprogramming BIF similar
    to a fully-qualified call – but the arity itself can be unknown at the time of the
    call, if ArgumentsList is passed as a variable

    To call a lambda function, use (arg1, …, argN) after its expression (between
    parentheses) or after any variable bound to it
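
The calling styles listed above can be placed side by side in one function. This is a sketch with made-up names (call_demo, double/1); lists:max/1 is a standard library function used only as a convenient target for apply/3.

```erlang
-module(call_demo).
-export([double/1, all_styles/0]).

double(X) -> 2 * X.

all_styles() ->
    Mod = ?MODULE,
    Fn = double,
    [double(1),                       % local call
     ?MODULE:double(2),               % fully-qualified call to this module
     Mod:Fn(3),                       % module and function held in variables
     apply(lists, max, [[4, 9, 2]]),  % arguments supplied as a list
     (fun (X) -> X + 1 end)(10)].     % calling a lambda expression directly
```

call_demo:all_styles() returns [2, 4, 6, 9, 11]. Note that the fully-qualified and variable-based calls only work because double/1 is exported.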


  40. Records

    Records, similarly to Pascal, provide a structured way to access fields
    by name

    Records are usually declared in include files, via a dedicated
    attribute:
    -record(recordType,
    {field1 [=defaultValue1], …, fieldN [=defaultValueN]})

    Record fields have no type declaration, like variables

    The value of a record field can even be set to another record instance

    Records are internally translated into tagged tuples, but they provide a
    lot more flexibility should one need to add fields


  41. Instantiating records

    To instantiate a record:
    #recordType{ fieldX = ValueX, …}
    – Missing fields will be assigned their default value –
    or undefined, if it was not declared
    – Unlike tuple items, fields can be assigned in any
    order
    – Usually, the instance is assigned to a variable:
    MyVar = #...


  42. Accessing record values

    To access a single field:
    RecordInstanceVar#recordType.fieldName
    – Again, the expression is usually assigned to a variable

    To access multiple fields, use pattern matching:
    #recordType{fieldX=VarA, fieldY = VarB} = RecordInstance
    – will bind VarA to fieldX of RecordInstance and VarB to fieldY

    Record patterns are supported in other contexts, such as function
    heads or case and receive constructs


  43. Copying records

    Records are immutable – but creating a copy
    of a record having different values for one or
    more fields is very simple and efficient:
    NewRecord =
    RecordInstance#recordType{ fieldX =
    newValueX, fieldY = newValueY, …}

    NewRecord will be equal to RecordInstance,
    except the given fields
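
Declaring, instantiating, reading and copying a record can be sketched in one place. The person record and its fields are hypothetical and declared directly in the module for brevity.

```erlang
-module(record_demo).
-export([run/0]).

%% Hypothetical record type used only for this illustration.
-record(person, {name, age = 0, city}).

run() ->
    %% age takes its default (0), city is left undefined.
    P1 = #person{name = "Ada"},
    %% Copy of P1 with two fields changed; P1 itself is untouched.
    P2 = P1#person{age = 36, city = "London"},
    %% Read several fields at once via pattern matching.
    #person{name = Name, age = Age} = P2,
    {P1#person.age, P1#person.city, Name, Age, P2#person.city}.
```

record_demo:run() returns {0, undefined, "Ada", 36, "London"}.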


  44. Records in the shell

    To read all the records defined in a file, use:
    rr("fileName")

    To define a record in the shell, use:
    rd(…)
    which is the shell version of -record(...)


  45. Bit strings

    Bit string = untyped chunk of bits, used for performance
    reasons or when handling low-level protocols

    Bit notation: << Value1, Value2, …, ValueN >>
    where each value:
    – Can be an integer or a string
    – Can be followed by :size and/or /specifier1[-specifier2-...]

    Empty bit string: <<>>

    Bit strings also support bit string comprehensions,
    similar to list comprehensions.
    Example: << <<(X+1)>> || X <- [1, 2, 3] >> → << 2, 3, 4 >>


  46. Binaries

    Binary = a byte string, that is a bit string
    containing a number of bits evenly divisible by 8

    Any term can be converted to/from its binary
    representation via simple BIFs: term_to_binary/1
    and binary_to_term/1

    With a list of values:
    – term_to_binary/1 returns its binary representation
    – list_to_binary/1 returns a binary whose items are the
    list items


  47. Bit strings and pattern matching

    Bit strings, via bit notation with size declarations and
    specifiers, support very fine-grained pattern-matching

    Example:
    <<Higher:5, Lower:3>> = <<2#01010001:8/unsigned>>
    – The original value is 2#01010001, that is 81
    – Higher is bound to the bits 01010, that is 10
    – Lower is bound to the bits 001, that is 1

    Please, refer to Erlang’s documentation for further details


  48. Advanced bit pattern matching

    A striking trait is that size qualifiers can be expressed by
    variables previously bound in the very same pattern – which
    greatly simplifies frame and packet analysis via a single
    pattern.

    Example:
    << A:3, B:A, _/bits >> = << 2#1110101010101:13 >>
    – A is bound to the 3 left-most bits, 2#111, that is 7
    – B is bound to the following 7 bits, as A is its size: 2#0101010, that is 42
    – _/bits consumes the remaining bits, ensuring the match
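
Both kinds of bit pattern can be wrapped in functions and tried from the shell. In this sketch, the module name and the length-prefixed frame layout are assumptions made purely for illustration.

```erlang
-module(bits_demo).
-export([split/1, variable_size/0, frame/1]).

%% Splits one byte into its 5 high bits and its 3 low bits.
split(<<Higher:5, Lower:3>>) -> {Higher, Lower}.

%% The size of B is given by A, bound earlier in the same pattern.
variable_size() ->
    <<A:3, B:A, _/bits>> = <<2#1110101010101:13>>,
    {A, B}.

%% Hypothetical frame layout: a 1-byte length field, followed by that
%% many payload bytes, followed by whatever is left.
frame(<<Len:8, Payload:Len/binary, Rest/binary>>) -> {Payload, Rest}.
```

bits_demo:split(<<2#01010001>>) returns {10, 1}, bits_demo:variable_size() returns {7, 42}, and bits_demo:frame(<<3, "abcdef">>) returns {<<"abc">>, <<"def">>}.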


  49. Bitwise operators

    Erlang defines bitwise infix operators:
    – band
    – bor
    – bxor
    – bnot
    – bsl → shift left
    – bsr → shift right

    Example: 2#1011 band 2#0010 == 2#0010


  50. Writing to the console

    The io module provides utilities for writing to the
    console:
    – nl/0: outputs a newline character
    – write/1: outputs the given term – strings are printed as lists
    – format/2: similar to C’s printf, with an Erlang-specific format
    string

    It also provides input utilities:
    – read/1 reads a term from stdin, after showing the given
    prompt → returns {ok, InputTerm} on success

    io includes more functions – have a look at its EDoc


  51. Preprocessor

    EPP = Erlang PreProcessor – fairly similar to C and C++

    To define a compile-time constant, use
    -define(CONSTANT_NAME, value).

    A constant can be injected via ?CONSTANT_NAME in the source code
    – Predefined constants: ?MODULE, ?FILE, ?LINE, …

    -define can also create parametric macros – parametric textual replacements that, unlike
    functions, can be employed in guards

    -ifdef, -ifndef, -else, -endif enable conditional compilation

    Include files, often containing record definitions and constants:
    – Should have .hrl extension
    – Can be included via the attribute -include("includeFile.hrl")


  52. Runtime errors

    Defensive programming = trying to foresee and
    catch every single runtime error. It is uncommon in
    Erlang: the usual approach is to let a process crash, so
    that a dedicated process can decide what to do

    Common errors: badarith, function_clause,
    case_clause, if_clause, badmatch, undef

    throw/1 raises a throw, carrying the given term

    try/catch and the old-fashioned catch can
    intercept runtime errors
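
A small sketch of try/catch, distinguishing runtime errors from thrown terms. The module, the function names and the result tuples are all illustrative choices.

```erlang
-module(errors_demo).
-export([safe_div/2, classify/1]).

%% Turns the badarith error raised by a division by zero into a value.
safe_div(A, B) ->
    try A div B of
        Result -> {ok, Result}
    catch
        error:badarith -> {error, division_failed}
    end.

%% Runs the given zero-argument fun and reports how it ended.
classify(F) ->
    try F() of
        Value -> {returned, Value}
    catch
        throw:Term -> {thrown, Term};
        error:Reason -> {runtime_error, Reason}
    end.
```

For example, errors_demo:safe_div(1, 0) returns {error, division_failed}, and errors_demo:classify(fun () -> throw(oops) end) returns {thrown, oops}.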


  53. Benchmarking

    A useful function is:
    timer:tc(Module, Function, ArgumentsList)

    It calls a function timing its execution, then
    returns a tuple:
    {ExecutionTimeInMicroSeconds,
    FunctionResult}
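
As a usage sketch, the hypothetical function below times the sorting of a reversed list with timer:tc/3 and returns the elapsed microseconds together with the smallest item.

```erlang
-module(bench_demo).
-export([time_sort/1]).

%% Builds the list [N, N-1, ..., 1], sorts it under timer:tc/3 and
%% returns {ElapsedMicroseconds, FirstItemAfterSorting}.
time_sort(N) when N >= 1 ->
    List = lists:seq(N, 1, -1),
    {Micros, Sorted} = timer:tc(lists, sort, [List]),
    {Micros, hd(Sorted)}.
```

The measured time varies from run to run, so a single call is only a rough indication.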


  54. Part 3
    Parallelism in Erlang


  55. Parallelism in Erlang

    Erlang processes are not operating system
    threads – they are much more lightweight,
    handled by the VM itself

    Every process has a process identifier (pid)
    and a dedicated mailbox

    Processes do not communicate via shared
    memory, but by sending messages to each
    other’s mailbox


  56. Spawning a process
    spawn(Module, Function, ArgumentsList)

    spawns a new process and makes it execute the given
    Function, exported by Module

    spawn/3 always returns a pid (=process identifier): if the given
    function does not exist or can’t be called, the new process will
    crash due to a runtime error, but the spawning process won’t
    know

    ArgumentsList is designed to initialize the new process – for
    example, passing information about the spawning process

    Processes can even be spawned from lambda functions, using
    spawn/1


  57. Process lifetime

    A spawned process runs until:
    – Its function reaches the end → but, very often, the body is
    a tail-recursive function, or functions composing a finite-state
    machine
    – A runtime error occurs during its execution

    Multiple processes can run the very same code at
    once – after all, there are no global structures and
    everything is immutable (well, almost)

    Therefore, code structure and process dynamics
    are related but orthogonal dimensions


  58. Basic process functions

    BIFs:
    – spawn/3, spawn/1: spawns a new process
    – self/0: returns the pid of the running process
    – processes/0: returns the list of pids of the processes
    running on the current VM

    Shell functions:
    – flush(): fetches and prints out all the messages from
    the shell’s mailbox
    – i(): lists all the current processes


  59. Registering processes

    Processes designed to act as global services, thus having longer lifetime,
    can be registered – that is, assigned an alias

    register(Alias, Pid) → Alias is an atom, Pid is a pid – for example,
    returned by spawn/3. If the alias is already registered, a runtime error
    occurs

    Any process can register any (self or other) process

    Usually, processes are registered with the very name of the module
    containing their related code

    registered/0 → list of registered aliases

    whereis/1 → pid of the given alias

    Processes are automatically unregistered on termination

    In the shell, regs() lists the registered processes


  60. Sending messages

    The current process can send a message to any process:
    TargetProcess ! Message
    – TargetProcess is the pid / alias of the target process
    – Message is any Erlang term

    Messages are appended to the mailbox of the target
    process

    The same message can be sent to many processes:
    TargetProcess1 ! … ! TargetProcessN ! Message


  61. Details on message passing

    Messages sent from one process to another are guaranteed to arrive in the order
    they were sent – of course, messages from different processes may well arrive interleaved

    Message passing is asynchronous:
    – The sender immediately continues execution; for synchronous behavior, one
    must explicitly request an ack
    – If the target is an invalid pid, or the target process does not exist, nothing
    happens
    – Errors only occur when sending a message to an alias (for example, a non-
    registered alias)

    The sender's pid is not sent: if you want to pass it, add it to the
    message (after retrieving it via self/0)

    Tagged tuples or even atoms are common messages in simple
    situations


  62. Receiving messages

    A process can check its mailbox for messages:
    receive
    pattern1 [when guard1] -> expr1.1, …;
    (…)
    patternM [when guardM] -> exprM.1, …
    [after Milliseconds|infinity -> timeoutExp1, …]
    end

    The last pattern branch and the after branch must
    not include the trailing “;”


  63. Details on receiving messages

    receive is synchronous; every time, it linearly scans the whole
    mailbox (in arrival order) and tries to sequentially match each
    scanned message with the patterns (in declaration order):
    – If the scanned message matches a pattern – and the related guard is
    missing or true - the message is removed from the mailbox, receive
    stops scanning and the branch expressions are evaluated: the last one
    becomes receive's value
    – If no message matches, the process is suspended, waiting for the arrival
    of a matching message
    – If after is declared and no matching message arrives, the process is
    resumed after the given milliseconds and the after branch is evaluated
    – after 0 means that receive won’t block after scanning the mailbox
    – after infinity is the default behavior described – when after is missing


  64. Mailbox housekeeping

    As messages arrive in a process's mailbox, they must be fetched by a receive –
    otherwise, they'll cause:
    – Growing memory footprint
    – Slower reception time, as receive performs a linear scan

    Generally speaking, it is often a good idea to add a catch-all pattern to the
    main receive constructs, so as to remove unexpected messages: of course, it
    might be equally useful to log them, to trace their origin process and cause

    If you must receive N messages from N different processes in a precise
    order, you must not use a receive having N branches – the simplest correct
    solution consists in having N subsequent 1-branch receive constructs.

    When using timeouts, beware of stale messages due to previous requests (for
    example, to a server process) which timed out: you must find a way to flush
    them – for example, by introducing references and/or timestamps


  65. Functional interfaces

    Most often, in Erlang, access to shared resources is ensured
    by sending request messages to dedicated processes

    However, it is good practice to hide message passing behind a
    functional interface – that is, a module whose functions
    perform the actual requests

    There are several advantages:
    – Clients are unaware of the underlying protocol, which can be
    arbitrarily changed
    – The actual location of the server can be changed as well
    – Location transparency: in case of a remote server, the client will
    only notice longer response times and a higher percentage of failures


  66. Creating functional interfaces

    In lieu of a direct message:
    resource_service ! {request, Params}
    a functional interface provides a function:
    resource_service:request(Params)
    where resource_service is a process alias in the former case and a module in the
    latter – they belong to different namespaces.

    There are 2 types of calls:
    – Synchronous calls: the client expects a reply. The function blocks until the reply
    arrives, then usually returns {ok, Result} or {error, Reason}
    – Asynchronous calls: the client is not interested in the result, so the function immediately
    returns a value, usually ok
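
    A minimal sketch of such a functional interface. Only the name
    resource_service and the request/1 function come from the slide; the
    start/0 and notify/1 functions, the message protocol, the 5-second
    timeout and the "double an integer" behaviour are assumptions made for
    the sake of a runnable example.

```erlang
-module(resource_service).
-export([start/0, request/1, notify/1]).

%% Hypothetical server, registered under the same atom as the module.
start() ->
    register(resource_service, spawn(fun loop/0)),
    ok.

%% Synchronous call: blocks until the reply arrives or 5 seconds elapse.
request(Params) ->
    Ref = make_ref(),
    resource_service ! {request, self(), Ref, Params},
    receive
        {reply, Ref, Reply} -> Reply
    after 5000 ->
        {error, timeout}
    end.

%% Asynchronous call: fire and forget, immediately returns ok.
notify(Event) ->
    resource_service ! {notify, Event},
    ok.

loop() ->
    receive
        {request, From, Ref, N} when is_integer(N) ->
            From ! {reply, Ref, {ok, 2 * N}},
            loop();
        {request, From, Ref, Other} ->
            From ! {reply, Ref, {error, {bad_params, Other}}},
            loop();
        {notify, _Event} ->
            loop();
        _Unexpected ->
            %% Catch-all branch: discard messages outside the protocol.
            loop()
    end.
```

    Clients only ever call resource_service:request/1 and
    resource_service:notify/1, so the tuples exchanged with the server can
    change without touching client code. Note that sending to a name that is
    not registered raises badarg, so start/0 must be called first.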


  67. Parallelism issues

    Being functional and focused on immutability, Erlang is
    far better suited to parallelism than most other
    languages

    However, Erlang programs can still suffer from common
    issues:
    – Race conditions
    – Deadlocks
    – Starvation

    What’s more, Erlang still has mutable aspects – for
    example, the process alias registry in a VM


  68. Linking processes

    link(Pid) → creates a bidirectional existential
    link between the current process and the one
    whose pid is Pid

    link expresses a mutual dependency between
    the linked processes

    spawn_link/3 atomically spawns and links

    unlink(Pid) is also available, but it is used far less
    frequently than link


  69. Details on process linking

    By default, if either linked process terminates:
    – On normal termination, nothing happens
    – On abnormal termination, the other process gets an exit signal {'EXIT', Pid,
    Reason} and crashes, propagating the signal to its remaining linked processes -
    after replacing Pid with its own pid

    However, if the non-terminating process – or any process in the propagation
    chain – has called process_flag(trap_exit, true), it does not crash: it just
    receives a message in its mailbox, {'EXIT', Pid, Reason}, where Reason can be
    normal or not.

    Consequently, a process can choose to respawn the crashed linked process,
    therefore acting as a supervisor – it is fairly common in Erlang to attach
    spawned processes to fine-grained supervision trees

    For more info, please also refer to exit/1 and exit/2, as well as
    erlang:monitor/2
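
    The trapping behaviour described above can be sketched like this. The
    module name link_demo, the exit reason boom and the result tuples are
    hypothetical; the trapping process is spawned separately so that the
    caller's own trap_exit flag is left untouched.

```erlang
-module(link_demo).
-export([run/0]).

run() ->
    Parent = self(),
    spawn(fun() ->
              %% From now on, exit signals arrive as ordinary messages.
              process_flag(trap_exit, true),
              %% spawn_link/1 spawns and links atomically.
              Child = spawn_link(fun() -> exit(boom) end),
              receive
                  {'EXIT', Child, Why} ->
                      %% A supervisor-like process could respawn Child here.
                      Parent ! {child_exited, Why}
              after 1000 ->
                  Parent ! no_exit_message
              end
          end),
    receive
        {child_exited, Reason} -> {trapped, Reason};
        no_exit_message -> timeout
    after 2000 ->
        timeout
    end.
```

    Calling link_demo:run() should return {trapped, boom}: the abnormal
    exit of the child is delivered as a message instead of killing the
    linked process.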


  70. Distributed systems

    Erlang node = an executing runtime system

    Alive node = Erlang node having a name – so that it can
    communicate with other nodes

    A name can usually be assigned on startup:
    – Short name: erl -sname → all the nodes are in the same IP
    domain
    – Long name: erl -name → nodes can reside in arbitrary IP
    domains

    node/0 → identifier of the current node, which must be used by
    other nodes for communication

    net_kernel:start/1 and net_kernel:stop/0 can also be used, as well
    as erlang:is_alive/0


  71. Internode communication

    Nodes with short names can only communicate with other short-named
    nodes; likewise, long-named nodes only with long-named nodes

    Furthermore, they must both share the same atom called magic
    cookie, which can be set:
    – When starting the VM: erl -setcookie
    – Programmatically, via erlang:set_cookie/1
    – Writing the atom in the $HOME/.erlang.cookie file

    If none of the above options is chosen, a cookie file containing a
    random cookie is automatically created, so local nodes can
    automatically communicate with no setup

    To test inter-node communication, use net_adm:ping(Node), which
    returns pong on success and pang on failure


  72. Spawning a remote process
    spawn(Node, Module, Function, ArgumentsList)

    Spawns a process on the given Node. spawn_link/4 is also available

    Returns a pid → location transparency: when sending a message to a
    pid, such pid may reside on any node

    On the other hand, sending a message to a registered alias is not
    transparent:
    {Alias, Node} ! Message

    As usual, message passing should be hidden behind functional interfaces

    When spawning a process, or when passing it a lambda function, all the
    referenced modules must be available in the target node’s code path
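
    A small sketch of the two addressing styles. The module name
    remote_spawn_demo, the registered name echo_server and the message
    shapes are assumptions; passing node() as the argument targets the
    local node, which keeps the example runnable without a cluster.

```erlang
-module(remote_spawn_demo).
-export([run/1, echo/0]).

%% Node is any reachable node whose code path contains this module.
run(Node) ->
    Pid = spawn(Node, ?MODULE, echo, []),
    %% Sending to a pid looks the same wherever the process lives.
    Pid ! {self(), hello},
    ByPid = receive {Pid, R1} -> R1 after 1000 -> timeout end,
    %% Sending to a registered name must spell out the node.
    {echo_server, Node} ! {self(), world},
    ByName = receive {Pid, R2} -> R2 after 1000 -> timeout end,
    Pid ! stop,
    {ByPid, ByName}.

%% Entry point of the spawned process; exported because spawn/4 needs it.
echo() ->
    register(echo_server, self()),
    loop().

loop() ->
    receive
        {From, Msg} ->
            From ! {self(), Msg},
            loop();
        stop ->
            ok
    end.
```

    remote_spawn_demo:run(node()) is expected to return {hello, world}.
    In real code both sends would sit behind a functional interface, as
    recommended above.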


  73. Node network

    Erlang nodes - having the same name type (short/long) and sharing the same
    cookie - transparently and lazily get connected when one node references the
    other for the first time. Moreover, each of them transitively connects to every
    node the other is already connected to → scalability issues

    To override such behaviour:
    – erl -connect_all false → prevents transitive connections
    – erl -hidden → starts a hidden node – a node to which connections can only be made
    directly, not transitively

    To list the available nodes:
    – nodes() → all the nodes connected to this node, except hidden nodes
    – nodes(connected) → all the nodes connected to this node – including hidden nodes
    – nodes(hidden) → all the hidden nodes connected to this node

    node/1 → node containing the given pid/reference/...

    To monitor a node: monitor_node(Node, EnableMonitoring)


  74. RPC calls

    Calling a function residing on a connected node
    is simple:
    rpc:call(Node, Module, Function, Arguments)
    it returns:
    – The function result on success
    – {badrpc, Reason} in case of failure
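
    A brief sketch of handling both outcomes. The wrapper module rpc_demo
    and its function reverse_on/2 are hypothetical; rpc:call/4 and
    lists:reverse/1 are the only library calls involved.

```erlang
-module(rpc_demo).
-export([reverse_on/2]).

%% Reverses List on Node, turning the two possible outcomes of
%% rpc:call/4 into {ok, _} / {error, _} tuples.
reverse_on(Node, List) ->
    case rpc:call(Node, lists, reverse, [List]) of
        {badrpc, Reason} -> {error, Reason};
        Result -> {ok, Result}
    end.
```

    For instance, rpc_demo:reverse_on(node(), [1, 2, 3]) targets the local
    node and yields {ok, [3, 2, 1]}. Since a successful call returns the
    bare function result, a remote function that itself returns
    {badrpc, _} cannot be told apart from a failure; wrapping results
    explicitly on the remote side avoids that ambiguity.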


  75. Hot code loading

    Whenever a different version (identified by the -vsn attribute) of a module is loaded - for
    example, by compiling it with compile:file/1 or the shell’s c/1, or by loading it via
    code:load_file/1:
    1) the new module code is marked current
    2) the module code used until now is marked old
    3) the module code that was marked old is purged, and processes running its instructions
    are terminated

    After the update, processes spawned from a function in the old module code will
    continue calling local functions of the old module code. Fully qualified calls
    (Module:Function), instead, always use the new version of the function.

    To enable a process to auto-update to the latest version of its own module, it is
    important to make a fully-qualified tail call in its body: as soon as such call is reached
    (usually, after a receive), the process will call the new version of its body, thus switching
    to the new module code.
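
    The auto-update idiom can be sketched as below. The module name
    upgradable and its tiny counter protocol are assumptions used only to
    make the loop concrete.

```erlang
-module(upgradable).
-export([start/0, loop/1]).

start() ->
    spawn(?MODULE, loop, [0]).

%% loop/1 is exported because a fully qualified call can only reach
%% exported functions; that call always runs the current module code.
loop(Count) ->
    receive
        {get, From} ->
            From ! {count, Count},
            ?MODULE:loop(Count);      % fully qualified tail call
        increment ->
            ?MODULE:loop(Count + 1);  % switches to new code, if any
        stop ->
            ok
    end.
```

    If a new version of upgradable is loaded while a process waits in
    receive, the next message makes it re-enter loop/1 in the new code,
    carrying Count along. A plain loop(Count) call would instead keep the
    process in the old code until that code is purged.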


  76. OTP middleware

    Erlang includes a library of middleware templates:
    – Library modules provide generic behavior, carefully crafted
    to support different scenarios and exception cases
    – Client code must implement its specific behavior by
    providing callback modules exposing the interface
    required by the generic behavior

    For example, OTP provides:
    – gen_server → generic server in a client/server relation
    – supervisor → supervisor with fine-grained policies
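
    As an illustration of the generic/specific split, here is a minimal
    gen_server callback module. The counter_server name and its API are
    hypothetical; the sketch also assumes a reasonably recent OTP release,
    where handle_info, terminate and code_change are optional callbacks.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Functional interface for clients.
-export([start_link/0, increment/0, value/0, stop/0]).
%% Callbacks required by the generic gen_server behaviour.
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    %% Registers the server locally under the module name; 0 goes to init/1.
    gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).

increment() -> gen_server:cast(?MODULE, increment).   % asynchronous
value()     -> gen_server:call(?MODULE, value).       % synchronous
stop()      -> gen_server:stop(?MODULE).

init(Initial) ->
    {ok, Initial}.

handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast(increment, Count) ->
    {noreply, Count + 1}.
```

    The receive loop, timeouts and reply matching all live in gen_server;
    the callback module only states how each request changes the state.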


  77. Part 4
    Conclusion


  78. The tip of the iceberg

    Erlang includes ETS and Dets, for in-memory and on-disk
    caching, as well as Mnesia, a distributed, soft real-time
    transactional database system packaged as an OTP application

    UDP and TCP packets can be received as Erlang messages via
    receive

    wxErlang is a GUI toolkit based on wxWidgets (formerly wxWindows),
    employing Erlang's mailbox system to distribute events

    A visual debugger and a visual process monitor are included

    Erlang evolves over time, introducing new constructs! For
    example, it now supports maps, which are somewhat similar to
    records, but more lightweight in terms of syntactic sugar
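
    A short side-by-side sketch of records and maps. The module name
    maps_demo, the person record and the sample field values are
    hypothetical.

```erlang
-module(maps_demo).
-export([run/0]).

%% Records need a compile-time declaration...
-record(person, {name, age}).

run() ->
    R0 = #person{name = "Ada", age = 36},
    R1 = R0#person{age = 37},
    %% ...whereas maps need none, and keys can be added at run time.
    M0 = #{name => "Ada", age => 36},
    M1 = M0#{age := 37},            % := updates an existing key
    M2 = M1#{city => "London"},     % => may also add a new key
    #{age := Age} = M2,             % pattern matching on a single key
    {R1#person.age, Age, maps:size(M2)}.
```

    maps_demo:run() returns {37, 37, 3}: both structures hold the updated
    age, and the map has gained a third key.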


  79. Final considerations

    Erlang is a mature but modern language, used in real-
    world and real-time scenarios; it’s also simple,
    minimalist and elegant, thanks to its functional nature
    and its well-crafted libraries

    Erlang is an ecosystem as well, fostering brilliant and
    didactic ideas - such as lightweight processes,
    message passing, linking and location transparency –
    which have influenced other languages

    Finally, it is open source and supported by a vibrant
    community! ^__^


  80. Further references

    http://erlang.org → Official website

    https://twitter.com/erlang_org → Erlang on Twitter

    http://www.tryerlang.org/ → Hands-on tutorial

    Erlang Programming – book’s website

    Learn You Some Erlang for Great Good!

    Elixir - a functional language built on top of the Erlang VM
    Thanks for your attention! ^__^
