Designing and Evaluating a Distributed Computing Language Runtime

Erlang User Conference 2016

Christopher Meiklejohn

September 09, 2016

Transcript

  1. Designing and Evaluating a Distributed Computing Language Runtime
    Christopher Meiklejohn (@cmeik)
    Université catholique de Louvain, Belgium

  2–3. [Diagram: replicas RA and RB concurrently apply set(1), set(2), and set(3); without coordination, the final value at each replica is unknown.]

  4–6. Synchronization
    • To enforce an order
      Makes programming easier
    • Eliminate accidental nondeterminism
      Prevent race conditions
    • Techniques
      Locks, mutexes, semaphores, monitors, etc.

  7–8. Difficult Cases
    • “Internet of Things”
      Low power, limited memory and connectivity
    • Mobile Gaming
      Offline operation with replicated, shared state

  9–13. Weak Synchronization
    • Can we achieve anything without synchronization?
      Not really.
    • Strong Eventual Consistency (SEC)
      “Replicas that deliver the same updates have equivalent state”
    • Primary requirement
      Eventual replica-to-replica communication
    • Order insensitive! (Commutativity)
    • Duplicate insensitive! (Idempotence)

  14–15. [Diagram: RA and RB again apply set(1), set(2), and set(3), but now model the value as a max-register and exchange state; both replicas converge to 3 via max(2, 3).]
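The max-register convergence on slides 14–15 can be sketched outside the deck's Erlang; the Python below is illustrative only (`MaxRegister` is a hypothetical name, not a Lasp type). Because merging by max is commutative, associative, and idempotent, replicas converge regardless of delivery order or duplication.

```python
class MaxRegister:
    """Minimal sketch of a max-register CRDT: state is a single integer."""

    def __init__(self, value=0):
        self.value = value

    def set(self, v):
        # A local write can only raise the value.
        self.value = max(self.value, v)

    def merge(self, other):
        # Join of two replica states: take the maximum.
        self.value = max(self.value, other.value)

ra, rb = MaxRegister(), MaxRegister()
ra.set(1); ra.set(3)   # RA applies set(1), then set(3)
rb.set(2)              # RB applies set(2)

# Exchange state in both directions, with one duplicated delivery.
ra.merge(rb)
rb.merge(ra)
rb.merge(ra)           # duplicate delivery: idempotent, no effect

assert ra.value == rb.value == 3   # both replicas converge via max(2, 3)
```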

  16. How can we succeed with Strong Eventual Consistency?

  17–20. Programming SEC
    1. Eliminate accidental nondeterminism
      (ex. deterministic, modeling non-monotonic operations monotonically)
    2. Retain the properties of functional programming
      (ex. confluence, referential transparency over composition)
    3. Distributed, fault-tolerant runtime
      (ex. replication, membership, dissemination)

  21. Convergent Objects: Conflict-Free Replicated Data Types (SSS 2011)

  22–23. Conflict-Free Replicated Data Types
    • Many types exist with different properties
      Sets, counters, registers, flags, maps, graphs
    • Strong Eventual Consistency
      Instances satisfy the SEC property per-object

  24–27. [Diagram: an observed-remove set across replicas RA, RB, and RC. RA performs add(1), yielding (1, {a}, {}); RC concurrently performs add(1), yielding (1, {c}, {}), then remove(1), yielding (1, {c}, {c}). After state exchange, every replica holds (1, {a, c}, {c}): the remove only covers the tags it observed, so the concurrent add survives and all replicas converge to {1}.]
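The observed-remove trace above can be reproduced with a small sketch. This Python is illustrative, not Lasp's implementation; the class and tag names are hypothetical. Adds attach unique tags, removes record only the tags observed locally, and an element is present when it has a surviving tag.

```python
import uuid

class ORSet:
    """Sketch of an observed-remove set: state is (element, add-tags, remove-tags)."""

    def __init__(self):
        self.added = {}    # element -> set of unique tags from add()
        self.removed = {}  # element -> set of tags observed at remove time

    def add(self, e, tag=None):
        self.added.setdefault(e, set()).add(tag or uuid.uuid4().hex)

    def remove(self, e):
        # Remove only the tags this replica has actually observed.
        self.removed.setdefault(e, set()).update(self.added.get(e, set()))

    def merge(self, other):
        for e, tags in other.added.items():
            self.added.setdefault(e, set()).update(tags)
        for e, tags in other.removed.items():
            self.removed.setdefault(e, set()).update(tags)

    def value(self):
        return {e for e, tags in self.added.items()
                if tags - self.removed.get(e, set())}

# The slide trace: RA adds 1 (tag "a"); RC adds 1 (tag "c"), then removes it.
ra, rc = ORSet(), ORSet()
ra.add(1, tag="a")
rc.add(1, tag="c")
rc.remove(1)            # covers only the observed tag "c"

ra.merge(rc); rc.merge(ra)
assert ra.value() == rc.value() == {1}   # state (1, {a, c}, {c}): add wins
```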

  28. Programming SEC
    1. Eliminate accidental nondeterminism
    2. Retain the properties of functional programming
    3. Distributed, fault-tolerant runtime

  29. Convergent Programs: Lattice Processing (PPDP 2015)

  30–32. Lattice Processing (Lasp)
    • Distributed dataflow
      Declarative, functional programming model
    • Convergent data structures
      Primary data abstraction is the CRDT
    • Enables composition
      Provides functional composition of CRDTs that preserves the SEC property

  33–37.
    %% Create the initial set.
    S1 = declare(set),
    %% Add elements to the initial set and update.
    update(S1, {add, [1,2,3]}),
    %% Create a second set.
    S2 = declare(set),
    %% Apply a map operation from S1 into S2.
    map(S1, fun(X) -> X * 2 end, S2).
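Why does the map in the Erlang snippet preserve SEC? For sets, element-wise map distributes over the join (union), so the derived set converges no matter how the input replicas' states are combined. A minimal Python sketch of that confluence property (names are illustrative, not Lasp's API):

```python
def set_map(s, f):
    # Element-wise map over a grow-only set.
    return {f(x) for x in s}

def join(a, b):
    # Join for grow-only sets is set union.
    return a | b

double = lambda x: x * 2

# Two divergent replicas of the input set S1.
s1_at_a = {1, 2}
s1_at_b = {2, 3}

# Mapping then joining equals joining then mapping, so the derived
# set S2 converges to the same value regardless of delivery order.
mapped_then_joined = join(set_map(s1_at_a, double), set_map(s1_at_b, double))
joined_then_mapped = set_map(join(s1_at_a, s1_at_b), double)
assert mapped_then_joined == joined_then_mapped == {2, 4, 6}
```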

  38. Programming SEC
    1. Eliminate accidental nondeterminism
    2. Retain the properties of functional programming
    3. Distributed, fault-tolerant runtime

  39. Distributed Runtime: Selective Hearing (W-PSDS 2015)

  40–44. Selective Hearing
    • Epidemic broadcast based runtime system
      Provides a runtime system that scales to large numbers of nodes, is resilient to failures, and executes efficiently
    • Well-matched to Lattice Processing (Lasp)
      Epidemic broadcast mechanisms provide weak ordering but are resilient and efficient; Lasp’s programming model is tolerant to message reordering, disconnections, and node failures
    • “Selective Receive”
      Nodes selectively receive and process messages based on interest.
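The "selective receive" idea can be sketched as nodes that register interests and merge only matching broadcasts, ignoring everything else. This Python is a hypothetical illustration of the behaviour described on the slide, not the runtime's actual interface; the object ids are made up.

```python
class Node:
    """Sketch of interest-based receive over an epidemic broadcast layer."""

    def __init__(self, interests):
        self.interests = set(interests)  # object ids this node cares about
        self.state = {}                  # object id -> max-register value

    def deliver(self, object_id, value):
        # Selective receive: drop broadcasts for objects we don't track.
        if object_id not in self.interests:
            return
        # Merge idempotently, so duplicates and reordering are harmless.
        self.state[object_id] = max(self.state.get(object_id, 0), value)

n = Node(interests={"counter/ads"})
n.deliver("counter/ads", 7)
n.deliver("counter/other", 99)   # not of interest: ignored
n.deliver("counter/ads", 7)      # duplicate delivery: no effect

assert n.state == {"counter/ads": 7}
```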

  45–48. Layered Approach
    • Membership
      Configurable membership protocol which can operate in client-server or peer-to-peer mode
    • Broadcast (via Gossip, Tree, etc.)
      Efficient dissemination of both program state and application state via gossip, broadcast tree, or a hybrid mode
    • Auto-discovery
      Integration with Mesos; auto-discovery of Lasp nodes for ease of configuration

  49–54. [Diagram, built up over several slides: a membership overlay underpins a broadcast overlay spanning mobile phones and a distributed hash table, on top of which the Lasp execution runs.]

  55. Programming SEC
    1. Eliminate accidental nondeterminism
    2. Retain the properties of functional programming
    3. Distributed, fault-tolerant runtime

  56. What can we build? Advertisement Counter

  57–58. Advertisement Counter
    • Mobile game platform selling advertisement space
      Advertisements are paid according to a minimum number of impressions
    • Clients will go offline
      Clients have limited connectivity, and the system still needs to make progress while clients are offline
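A grow-only counter is the natural CRDT behind this design: each client increments its own entry while offline, and merged replicas never lose increments. The Python below is a sketch of that idea with hypothetical names; the 50,000-impression threshold from the deck is scaled down to 3 for illustration.

```python
class GCounter:
    """Sketch of a grow-only counter: one entry per replica, merged entry-wise by max."""

    def __init__(self):
        self.entries = {}

    def increment(self, replica):
        self.entries[replica] = self.entries.get(replica, 0) + 1

    def merge(self, other):
        for r, n in other.entries.items():
            self.entries[r] = max(self.entries.get(r, 0), n)

    def value(self):
        return sum(self.entries.values())

IMPRESSION_TARGET = 3   # stand-in for the 50,000 impressions in the slides

# Each client increments its own copy while offline...
a, b = GCounter(), GCounter()
a.increment("client_a"); a.increment("client_a")
b.increment("client_b")

# ...and when replicas eventually merge, the total is recoverable and the
# ad can be removed once the contracted impression count is reached.
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3
assert a.value() >= IMPRESSION_TARGET   # threshold met: remove the ad
```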

  59–67. [Diagram, walked through over several build slides: the advertisement counter dataflow. User-maintained CRDTs hold the Rovio Ads, the Riot Ads, and the Contracts. Lasp operations derive the rest: a Union of Rovio Ads and Riot Ads yields Ads; a Product of Ads and Contracts followed by a Filter yields Ads With Contracts; a Read feeds the per-ad counters (Rovio Ad Counters 1 and 2, Riot Ad Counters 1 and 2), which are Lasp-maintained CRDTs replicated client side with a single copy at each client. Each impression Increments a counter; when a Read observes 50,000 impressions, the ad is Removed from circulation.]

  68. Evaluation: Initial Evaluation

  69–72. Background: Distributed Erlang
    • Transparent distribution
      Built in, provided by Erlang/BEAM: cross-node message passing.
    • Known scalability limitations
      Analyzed in various academic publications.
    • Single connection
      Head-of-line blocking.
    • Full membership
      All-to-all failure detection with heartbeats and timeouts.

  73–74. Background: Erlang Port Mapper Daemon
    • Operates on a known port
      Similar to the Solaris sunrpc-style portmap: a known port maps to dynamic port-based services.
    • Bridged networking
      Problematic for clustering under bridged networking with dynamic port allocation.

  75–78. Experiment Design
    • Single application
      Advertisement counter example from Rovio Entertainment.
    • Runtime configuration
      Application controlled through runtime environment variables.
    • Membership
      Full membership with Distributed Erlang via EPMD.
    • Dissemination
      State-based object dissemination through an anti-entropy protocol (fanout-based, PARC-style).
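Fanout-based, state-based anti-entropy amounts to each node periodically shipping its full state to a few peers, which merge it via the join. The Python below is a simplified sketch under stated assumptions: grow-only-set state (so join is union) and deterministic ring-neighbour selection standing in for random peer choice.

```python
def anti_entropy_round(states, fanout=2):
    # Each node ships its full state to `fanout` peers; receivers merge
    # via the join (union for grow-only sets), which is idempotent.
    # Deterministic ring order here stands in for random peer selection.
    nodes = sorted(states)
    n = len(nodes)
    for i, sender in enumerate(nodes):
        for j in range(1, fanout + 1):
            peer = nodes[(i + j) % n]
            states[peer] = states[peer] | states[sender]

# Four nodes, each starting with a disjoint piece of the state.
states = {"n1": {1}, "n2": {2}, "n3": {3}, "n4": set()}
anti_entropy_round(states)
anti_entropy_round(states)   # extra round: idempotent, no further change

assert all(s == {1, 2, 3} for s in states.values())
```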

  79–82. Experiment Orchestration
    • Docker and Mesos with Marathon
      Used for deployment of both EPMD and the Lasp application.
    • Single EPMD instance per slave
      Controlled through the use of host networking and HOSTNAME:UNIQUE constraints in Mesos.
    • Lasp
      Local execution using host networking; connects to the local EPMD.
    • Service Discovery
      Facilitated by clustering EPMD instances through Sprinter.

  83–84. Ideal Experiment
    • Local Deployment
      High thread concurrency when operating with a lower node count.
    • Cloud Deployment
      Low thread concurrency when operating with a higher node count.

  85. Results: Initial Evaluation

  86–89. Initial Evaluation
    • Moved to DC/OS exclusively
      The environments were too different: too much work was needed to adapt things to work correctly in both.
    • Single orchestration task
      Dispatched events, controlled when to start and stop the evaluation, and performed log aggregation.
    • Bottleneck
      Events were dispatched immediately; realistic pacing would require blocking for processing acknowledgment.
    • Unrealistic
      Events do not queue up all at once for processing by the client.

  90. Lasp Difficulties
    • Too expensive

    2.0 CPU and 2048 MiB of memory.
    61

    View full-size slide

  91. Lasp Difficulties
    • Too expensive

    2.0 CPU and 2048 MiB of memory.
    • Weeks spent adding instrumentation

    Process level, VM level, Erlang Observer instrumentation to
    identify heavy CPU and memory processes.
    61

    View full-size slide

  92. Lasp Difficulties
    • Too expensive

    2.0 CPU and 2048 MiB of memory.
    • Weeks spent adding instrumentation

    Process level, VM level, Erlang Observer instrumentation to
    identify heavy CPU and memory processes.
    • Dissemination too expensive

    1000 threads to a single dissemination process (one Mesos
    task) leads to backed up message queues and memory leaks.
    61

    View full-size slide

  93. Lasp Difficulties
    • Too expensive

    2.0 CPU and 2048 MiB of memory.
    • Weeks spent adding instrumentation

    Process level, VM level, Erlang Observer instrumentation to
    identify heavy CPU and memory processes.
    • Dissemination too expensive

1000 threads funneling into a single dissemination process (one
Mesos task) lead to backed-up message queues and memory leaks.
    • Unrealistic

    Two different dissemination mechanisms: thread to thread and
    node to node: one is synthetic.
    61


  96. EPMD Difficulties
    • Nodes become unregistered

    Nodes randomly unregistered with EPMD during
    execution.
    • Lost connection

    EPMD loses connections with nodes for some
    arbitrary reason.
    • EPMD task restarted by Mesos

    Restarted for an unknown reason, which leads
    Lasp instances to restart in their own container.
    62


  99. Overhead Difficulties
    • Too much state

    Client would ship around 5 GiB of state within 90
    seconds.
    • Delta dissemination

    Delta dissemination only provides around a 30%
    decrease in state transmission.
    • Unbounded queues

    Message buffers would lead to VMs crashing
    because of large memory consumption.
    63
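The delta idea above can be sketched as follows. This is a minimal illustration, not Lasp's implementation: a grow-only set that buffers elements added since the last sync, so a replica ships only the delta rather than its full state.

```python
# Illustrative delta-state G-Set sketch (assumed design, not Lasp's code):
# additions accumulate in a delta buffer; merging is set union, which
# works identically for deltas and full states.

class DeltaGSet:
    def __init__(self):
        self.state = set()   # full CRDT state
        self.delta = set()   # elements added since the last flush

    def add(self, element):
        if element not in self.state:
            self.state.add(element)
            self.delta.add(element)

    def flush_delta(self):
        """Return the pending delta and reset the buffer."""
        pending, self.delta = self.delta, set()
        return pending

    def merge(self, incoming):
        """Join: union of incoming delta (or full state) into local state."""
        self.state |= incoming

replica_a = DeltaGSet()
for i in range(100):
    replica_a.add(i)
replica_a.flush_delta()          # pretend an earlier sync shipped 0..99

replica_a.add(100)
delta = replica_a.flush_delta()  # only {100}, not all 101 elements

replica_b = DeltaGSet()
replica_b.merge(delta)
```

The savings depend entirely on workload shape, which is one reason delta dissemination alone only bought ~30% here.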


  100. Evaluation
    Rearchitecture
    64


  103. Ditch Distributed Erlang
    • Pluggable membership service

    Build pluggable membership service with abstract
    interface initially on EPMD and later migrate after tested.
    • Adapt Lasp and Broadcast layer

    Integrate pluggable membership service throughout the
stack and liberate existing libraries from distributed
    Erlang.
    • Build service discovery mechanism

Automate node discovery outside of EPMD based on the
new membership service.
    65


  110. Partisan
    (Membership Layer)
    • Pluggable protocol membership layer

    Allow runtime configuration of protocols used for cluster membership.
    • Several protocol implementations:
    • Full membership via EPMD.
    • Full membership via TCP.
    • Client-server membership via TCP.
    • Peer-to-peer membership via TCP (with HyParView).
    • Visualization

    Provide a force-directed graph-based visualization engine for cluster
    debugging in real-time.
    66
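The pluggable layer can be sketched as a single abstract interface with swappable backends, chosen at runtime. This is an illustrative Python sketch; the names and shapes are assumptions, not Partisan's actual Erlang API.

```python
# Hypothetical pluggable membership interface in the spirit of Partisan:
# the protocol (full mesh, client-server, HyParView, ...) sits behind one
# abstract API and is selected by runtime configuration.

from abc import ABC, abstractmethod

class Membership(ABC):
    @abstractmethod
    def join(self, node): ...

    @abstractmethod
    def members(self): ...

class FullMembership(Membership):
    """Every node has full visibility into the cluster."""
    def __init__(self, self_node):
        self.view = {self_node}

    def join(self, node):
        self.view.add(node)

    def members(self):
        return set(self.view)

class ClientServerMembership(Membership):
    """Clients only ever see the server as a peer."""
    def __init__(self, self_node, server):
        self.self_node, self.server = self_node, server

    def join(self, node):
        pass  # clients do not track peers directly

    def members(self):
        return {self.server}

def make_membership(mode, **kwargs):
    backends = {"full": FullMembership,
                "client_server": ClientServerMembership}
    return backends[mode](**kwargs)

m = make_membership("full", self_node="a")
m.join("b")
cs = make_membership("client_server", self_node="c", server="s")
```

Keeping the interface abstract is what later allows swapping protocols to separate protocol bugs from application bugs.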


  114. Partisan
    (Full via EPMD or TCP)
    • Full membership

    Nodes have full visibility into the entire graph.
    • Failure detection

    Performed by peer-to-peer heartbeat messages with a
    timeout.
    • Limited scalability

    Heartbeat interval increases as node count grows,
    leading to false or delayed detection.
    • Testing

    Used to create the initial test suite for Partisan.
    67
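Heartbeat-plus-timeout failure detection, as used by the full-membership modes, can be sketched in a few lines. This is a simplified illustration under assumed names, not Partisan's code; it also shows why the scheme degrades as intervals stretch.

```python
# Simplified heartbeat failure detector: a peer is suspected failed if
# no heartbeat has arrived within the timeout window. Times are plain
# numbers here for clarity.

class FailureDetector:
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, peer, now):
        self.last_seen[peer] = now

    def suspected(self, now):
        """Peers whose last heartbeat is older than the timeout."""
        return {peer for peer, t in self.last_seen.items()
                if now - t > self.timeout}

fd = FailureDetector(timeout=5)
fd.heartbeat("a", now=0)
fd.heartbeat("b", now=3)
```

A large timeout delays detection; a small one with many peers produces false suspicions, which is the scalability limit the slide describes.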


  118. Partisan
    (Client-Server Model)
    • Client-server membership

    Server has all peers in the system as peers; client has
    only the server as a peer.
    • Failure detection

    Nodes heartbeat with timeout all peers they are aware of.
    • Limited scalability

    Single point of failure: server; with limited scalability on
    visibility.
    • Testing

    Used for baseline evaluations as “reference” architecture.
    68


  121. Partisan
    (HyParView, default)
    • Partial view protocol

    Two views: active (fixed) and passive (log n); passive
    used for failure replacement with active view.
    • Failure detection

    Performed by monitoring active TCP connections to
    peers with keep-alive enabled.
    • Very scalable (10k+ nodes during academic
    evaluation)

    However, probabilistic; potentially leads to isolated
    nodes during churn.
    69
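The two-view mechanism can be sketched as follows. This is a heavily simplified illustration of the idea only; the real HyParView protocol also performs periodic shuffles and TTL-bounded random walks, which are omitted here.

```python
# Sketch of HyParView's core structure: a small fixed-size active view
# of open TCP connections, plus a larger passive view used to replace
# failed active peers.

import random

class HyParViewNode:
    def __init__(self, active_size=5, passive_size=30):
        self.active = set()
        self.passive = set()
        self.active_size = active_size
        self.passive_size = passive_size

    def add_peer(self, peer):
        """Fill the active view first; overflow goes to the passive view."""
        if len(self.active) < self.active_size:
            self.active.add(peer)
        elif len(self.passive) < self.passive_size:
            self.passive.add(peer)

    def peer_failed(self, peer):
        """On TCP connection loss, promote a random passive peer."""
        self.active.discard(peer)
        if self.passive:
            replacement = random.choice(sorted(self.passive))
            self.passive.discard(replacement)
            self.active.add(replacement)

node = HyParViewNode(active_size=2)
for p in ["b", "c", "d", "e"]:
    node.add_peer(p)
node.peer_failed("b")
```

Because replacement is probabilistic, rapid churn can still leave a node with an empty or asymmetric active view, which is exactly the isolation problem observed later.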


  124. Sprinter
    (Service Discovery)
    • Responsible for clustering tasks

    Uses Partisan to cluster all nodes and ensure connected
    overlay network: reads information from Marathon.
    • Node local

    Operates at each node and is responsible for taking
    actions to ensure connected graph: required for
    probabilistic protocols.
    • Membership mode specific

    Knows, based on the membership mode, how to properly
    cluster nodes and enforces proper join behaviour.
    70


  129. Debugging Sprinter
    • S3 archival

    Nodes periodically snapshot their membership view for analysis.
    • Elected node (or group) analyses 

    Periodically analyses the information in S3 for the following:
    • Isolated node detection

    Identifies isolated nodes and takes corrective measures to repair the
    overlay.
    • Verifies symmetric relationship

    Ensures that if a node knows about another node, the relationship is
    symmetric: prevents I know you, but you don’t know me.
    • Periodic alerting

    Alerts regarding disconnected graphs so external measures can be
    taken, if necessary.
    71
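The analyses described above can be sketched over archived snapshots. This assumes each node's snapshot is simply the set of peers it knows about; the function names and snapshot shape are illustrative, not Sprinter's actual code.

```python
# Sketch of overlay analysis over membership snapshots: detect
# asymmetric links ("I know you, but you don't know me") and isolated
# nodes that no corrective measure has repaired yet.

def find_asymmetric_links(snapshots):
    """Pairs (a, b) where a knows b but b does not know a."""
    bad = set()
    for node, peers in snapshots.items():
        for peer in peers:
            if node not in snapshots.get(peer, set()):
                bad.add((node, peer))
    return bad

def find_isolated_nodes(snapshots):
    """Nodes with no peers that no other node references."""
    referenced = set().union(*snapshots.values()) if snapshots else set()
    return {node for node, peers in snapshots.items()
            if not peers and node not in referenced}

# Example snapshots, as might be pulled from the S3 archive:
views = {
    "a": {"b"},
    "b": set(),   # b does not know a: asymmetric link
    "c": set(),   # nobody references c: isolated
}
```

Running this kind of check periodically from an elected node is what drives the corrective joins and the disconnected-graph alerts.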


  130. Evaluation
    Next Evaluation
    72


  136. Evaluation Strategy
    • Deployment and runtime configuration

    Ability to deploy a cluster of nodes and configure simulations at runtime.
    • Each simulation:
    • Different application scenario

    Uniquely execute a different application scenario at runtime based on
    runtime configuration.
    • Result aggregation

    Aggregate results at end of execution and archive these results.
    • Plot generation

    Automatically generate plots for the execution and aggregate the results of
    multiple executions.
    • Minimal coordination 

    Work must be performed with minimal coordination, as a single orchestrator is a
    scalability bottleneck for large applications.
    73


  142. Completion Detection
    • “Convergence Structure”

    Uninstrumented CRDT of grow-only sets containing counters that each node
    manipulates.
    • Simulates a workflow

    Nodes use this operation to simulate a lock-step workflow for the experiment.
    • Event Generation

    Event generation toggles a boolean for the node to show completion.
    • Log Aggregation

    Completion triggers log aggregation.
    • Shutdown

    Upon log aggregation completion, nodes shutdown.
    • External monitoring

    When events complete execution, nodes automatically begin the next
    experiment.
    74
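The convergence structure above can be sketched as a grow-only map from node id to a completion flag, merged by pairwise join: once every node's flag is observed true in the merged state, the experiment is complete and log aggregation can fire. This mirrors the idea, not Lasp's actual CRDT types.

```python
# Illustrative completion detection over a monotone map: a flag, once
# true, stays true under merge, so convergence of the merged view is
# safe to act on.

def merge(a, b):
    """Join two views; True is absorbing for each node's flag."""
    out = dict(a)
    for node, done in b.items():
        out[node] = out.get(node, False) or done
    return out

def all_complete(view, expected_nodes):
    """Experiment is done when every expected node's flag is true."""
    return all(view.get(n, False) for n in expected_nodes)

nodes = ["n1", "n2", "n3"]

# Each node initially only knows about its own completion.
local_views = [{n: True} for n in nodes]

converged = {}
for view in local_views:
    converged = merge(converged, view)
```

Because the structure is monotone, no coordinator is needed: any node that observes full completion in its merged view can trigger aggregation and shutdown.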


  143. Results
    Next Evaluation
    75


  146. Results Lasp
    • Single node orchestration: bad

    Not possible once you exceed a few nodes:
    message queues, memory, delays.
    • Partial Views

    Required: rely on transitive dissemination of
    information and partial network knowledge.
    • Results

    Reduced Lasp memory footprint to 75MB; larger
    in practice for debugging.
    76


  150. Results Partisan
    • Fast churn isolates nodes

    Need a repair mechanism: random promotion of isolated
    nodes; mainly issues of symmetry.
    • FIFO across connections

    FIFO holds per connection, but the protocol assumes it across
    all connections, leading to false disconnects.
    • Unrealistic system model

    You need per message acknowledgements for safety.
    • Pluggable protocol helps debugging

    Being able to switch to full membership or client-server assists
    in debugging protocol vs. application problems.
    77


  153. Latest Results
    • Reproducibility at 300 nodes for full applications

    Connectivity, but transient partitions and isolated
    nodes at 500-1000 nodes (across 140 instances).
    • Limited financially and by Amazon

    Harder to run larger evaluations because we’re
    limited financially (as a university) and because of
    Amazon limits.
    • Mean state reduction per client

    Around 100x improvement from our PaPoC 2016
    initial evaluation results.
    78


  157. Takeaways (plat à emporter)
    • Visualizations are important!

    Graph performance, visualize your cluster: all of these things lead to
    easier debugging.
    • Control changes

    No Lasp PR accepted without divergence, state transmission, and
    overhead graphs.
    • Automation

    Developers use graphs when they are easy to make: lower the
    difficulty for generation and understand how changes alter system
    behaviour.
    • Make work easily testable

    When you test locally and deploy globally, you need to make things
    easy to test, deploy and evaluate (for good science, I say!)
    79


  158. 80
    Christopher Meiklejohn

    @cmeik
    http://www.lasp-lang.org
    http://github.com/lasp-lang
    Thanks!
