Robust, Large-scale Concurrent and Distributed Programming

Philipp Haller
August 17, 2018

Transcript

  1. Philipp Haller
    Philipp Haller
    1
    Docent Lecture
    Robust, Large-scale Concurrent and
    Distributed Programming
    KTH Royal Institute of Technology
    Stockholm, Sweden
    August 17th, 2018

  5. Philipp Haller
    Motivation
    Demands of new and emerging software applications:
    • Rapidly increasing scale of workloads:
    – CERN amassed about 200 PB of data from over 800 trillion collisions searching for the
    Higgs boson. [1]
    – Steam, a digital content distribution service, delivers 16.9 PB per week to users in
    Germany (USA: 46.9 PB) [2]
    – Twitter has about 330 million monthly active users [3]
    • Reacting at the speed of the environment (guaranteed timely responses)
    – Example: autonomous driving
    • High availability
    • Fault tolerance
    2

  7. Philipp Haller 3
    Steam delivers 16.9 PB per week to
    users in Germany (USA: 46.9 PB) [2]

  14. Philipp Haller
    Motivation
    Demands of new and emerging software applications:
    • Rapidly increasing scale of workloads:
    – CERN amassed about 200 PB of data from over 800 trillion collisions searching for the
    Higgs boson. [1]
    – Steam, a digital content distribution service, delivers 16.9 PB per week to users in
    Germany (USA: 46.9 PB) [2] (as of February 2018)
    – Twitter has about 330 million monthly active users [3] (as of Q4 2017)
    • Reacting at the speed of the environment (guaranteed timely responses)
    – Example: autonomous driving
    • High availability
    • Fault tolerance

  20. Philipp Haller
    Distributed Programming: A Solution
    • Enables constructing systems that are:
    – physically distributed, e.g. Internet of Things
    – fault-tolerant
    – highly available
    – elastic (subsumes scalable)
    5

  26. Philipp Haller
    Distributed Programming: A Challenge
    • Programmers must master the complex interplay of:
    – concurrency of computations
    – asynchronicity of events
    – failure of communication and/or systems
    • An extreme challenge even for expert programmers
    6

  27. Philipp Haller
    Overview
    • Motivation
    • Part 1: Type systems for data-race safe concurrency
    • Part 2: Practical deterministic concurrency
    • Part 3: Lineage-based distributed programming
    • Ongoing and future work
    • Conclusion
    7

  35. Philipp Haller
    Data Races: A Concurrency Hazard
    • What is a data race?
    • A data race occurs
    – when two tasks (threads, processes, actors) concurrently access the
    same shared variable (or object field) and
    – at least one of the accesses is a write (an assignment)
    • In practice, data races are difficult to find and fix
    • They can have dramatic consequences…
    9

  39. Philipp Haller 10
    The Northeast blackout of 2003: a widespread power outage throughout
    parts of the Northeastern and Midwestern US and the Canadian province of
    Ontario on August 14, 2003
    Primary cause: a data-race bug in the alarm system at the control room of
    FirstEnergy Corporation
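
The definition of a data race given earlier can be made concrete with a small sketch. It is not taken from the talk: the names `UnsafeCounter`, `SafeCounter` and `hammer` are hypothetical, and plain JVM threads stand in for "tasks". Several threads write the same field without synchronization, so increments may be lost; the atomic variant shows one conventional fix.

```scala
import java.util.concurrent.atomic.AtomicInteger

object DataRaceSketch {
  // Shared mutable field without synchronization: `value += 1` is a
  // read-modify-write, so concurrent increments can overwrite each other.
  final class UnsafeCounter {
    var value = 0
    def increment(): Unit = { value += 1 }
  }

  // Atomic variant: every increment is indivisible.
  final class SafeCounter {
    private val v = new AtomicInteger(0)
    def increment(): Unit = { v.incrementAndGet(); () }
    def value: Int = v.get()
  }

  // Hypothetical helper: runs `threads` threads, each calling `body` `perThread` times.
  def hammer(threads: Int, perThread: Int)(body: () => Unit): Unit = {
    val ts = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = {
          var i = 0
          while (i < perThread) { body(); i += 1 }
        }
      })
    }
    ts.foreach(_.start())
    ts.foreach(_.join())
  }

  def racyTotal(): Int = {
    val c = new UnsafeCounter
    hammer(4, 100000)(() => c.increment())
    c.value // may be smaller than 400000 because updates can be lost
  }

  def safeTotal(): Int = {
    val c = new SafeCounter
    hammer(4, 100000)(() => c.increment())
    c.value
  }
}
```

Whether `racyTotal()` actually loses updates depends on the machine and the schedule, which is part of why such bugs are hard to find and fix.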

  41. Philipp Haller
    Problem
    Most widely-used programming languages cannot ensure data-race
    safety for their provided or enabled concurrency abstractions
    11
    IEEE Spectrum ranking "Top Programming Languages 2018" ("Trending" preset)
    https://spectrum.ieee.org/static/interactive-the-top-programming-languages-2018

  48. Philipp Haller
    Goal
    Static prevention of data races
    • using a lightweight type system
    • that minimizes the effort to reuse existing code
    Focus:
    • Existing, full-featured languages like Scala
    (in contrast to new language designs like Rust)

  55. Philipp Haller
    State of the Art
    • A lot of progress in type systems for safe concurrency (linear and affine
    types, static capabilities, uniqueness types, ownership types, region
    inference, etc.)
    • Challenges:
    – Sound integration with advanced type system features
    (example: local type inference)
    – Adoption on large scale
    • Key: reuse of existing code

  60. Philipp Haller
    Example
    [Figure: a filter is applied to image data]
    Image processing pipeline: filter 1, then filter 2
    Pipeline stages run concurrently

  63. Philipp Haller
    Example: Implementation
    • Assumptions:
    – Main memory expensive
    – Image data large
    • Approach for high performance:
    – Each pipeline stage is a concurrent actor
    – In-place update of image buffers
    – Pass mutable buffers by reference between actors
    15

  64. Philipp Haller
    Example: Problem
    Easy to produce data races:
    1. Stage 1 sends a reference to a buffer to stage 2
    2. Following the send, both stages have a reference to the same buffer
    3. Stages can concurrently access the buffer
    16
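
The three steps above can be replayed in a few lines. This is only a sketch under stated assumptions: a `LinkedBlockingQueue` named `stage2Mailbox` (hypothetical) stands in for the mailbox of the second actor, and everything runs on one thread so that the aliasing is visible deterministically.

```scala
import java.util.concurrent.LinkedBlockingQueue

object AliasingSketch {
  // Hypothetical mailbox of "stage 2"; a plain queue stands in for an actor.
  val stage2Mailbox = new LinkedBlockingQueue[Array[Int]]()

  def run(): (Boolean, Int) = {
    val buffer = Array(1, 2, 3, 4)
    stage2Mailbox.put(buffer)            // 1. stage 1 sends a reference to the buffer
    buffer(0) = 99                       // 2. stage 1 still holds the reference and writes
    val received = stage2Mailbox.take()  // 3. stage 2 obtains the very same object
    (received eq buffer, received(0))    // same reference, and the late write is visible
  }
}
```

With real concurrent stages, the write in step 2 and a read in stage 2 would be unordered, which is exactly the data race described above.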

  73. Philipp Haller
    Preventing Data Races
    Approach:
    – Extend type system with affine types and object capabilities
    – Affine types:
    • Variables of affine type may be "used" at most once
    • "Used" = consumed
    • A consumed variable cannot be accessed any more
    – Values of affine type are called permissions in our system
    – Permissions control access to transferable objects
    17
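
The "used at most once" rule can be approximated at run time to get a feel for it. The sketch below is an assumption-laden illustration, not LaCasa: the class `Affine` is hypothetical, and it detects a second use dynamically with an exception, whereas the type system described in the talk rejects such programs statically.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object AffineSketch {
  // Hypothetical run-time stand-in for a value of affine type:
  // it can be consumed at most once.
  final class Affine[T](value: T) {
    private val consumed = new AtomicBoolean(false)

    // The first call returns the value; any later call fails.
    def consume(): T =
      if (consumed.compareAndSet(false, true)) value
      else throw new IllegalStateException("already consumed")

    def isConsumed: Boolean = consumed.get()
  }

  // Returns true if the second use of the same value is rejected.
  def secondUseFails(): Boolean = {
    val p = new Affine("permission for some box")
    p.consume()
    try { p.consume(); false }
    catch { case _: IllegalStateException => true }
  }
}
```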

  74. Philipp Haller
    Guarantee of the Type System
    18
    Exchanging transferable objects between actors
    preserves actor isolation

  78. Philipp Haller
    LaCasa: An Extension of Scala with Affine Types and
    Object Capabilities
    Transferable objects: instances of a new generic type Box[T]
    def receive(box: Box[Message]): Unit = {
      box open { msg =>   // msg is the encapsulated object
        msg.buffer = Array(1, 2, 3, 4)
      }
      ...
    }
    class Message {
      var buffer: Array[Byte] = _
    }
    Accessing an encapsulated object requires the use of open

  86. Philipp Haller
    Permissions
    • The above code is still incomplete: opening a box requires a
    corresponding permission provided by the context
    • Invoking open on box requires a permission with the following type:
        CanAccess { type C = box.C }   (dependent type)
    • Type member C links the permission type to a specific box
    • A permission type CanAccess { type C = låda.C } would only be
    compatible with box iff
    – box and låda are aliases (statically-known)
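
The idea of tying a permission type to one specific box can be tried out in plain Scala. The following is a simplified sketch, not LaCasa's definitions: instead of the refinement `CanAccess { type C = box.C }` it uses a hypothetical inner class `Access`, so that `box.Access` and `other.Access` are distinct path-dependent types. Permissions in this sketch are not affine, and the `access` member that hands one out is an assumption made for illustration.

```scala
object PermissionSketch {
  final class Box[T](private val value: T) {
    // Inner class: for two boxes b1 and b2, `b1.Access` and `b2.Access`
    // are different types, so a permission is linked to exactly one box.
    final class Access private[Box] ()

    // Hypothetical helper that hands out the permission for this box.
    val access: Access = new Access()

    // Opening requires an implicit permission for *this* box.
    def open(f: T => Unit)(implicit p: Access): Unit = f(value)
  }

  final class Message {
    var buffer: Array[Byte] = Array.empty[Byte]
  }

  def demo(): Int = {
    val box = new Box(new Message)
    val other = new Box(new Message)
    implicit val p: box.Access = box.access

    var len = 0
    box.open { msg =>
      msg.buffer = Array[Byte](1, 2, 3, 4)
      len = msg.buffer.length
    }
    // other.open { _ => () }  // would not compile: no implicit `other.Access` in scope
    len
  }
}
```

The commented-out call shows the point of the dependent type: a permission for `box` says nothing about `other`.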

  88. Philipp Haller
    Permissions (2)
    Making permissions available in the context via implicit parameters:
    def receive(box: Box[Message])
        (implicit p: CanAccess { type C = box.C }): Unit = {
      box open { msg =>
        msg.buffer = Array(1, 2, 3, 4)
      }
      ...
    }

  90. Philipp Haller
    Consuming Permissions
    Transferring a box from one actor to another consumes its access permission:
    def receive(box: Box[Message])
        (implicit p: CanAccess { type C = box.C }): Unit = {
      box open { msg =>
        msg.buffer = Array(1, 2, 3, 4)
      }
      ...
      someActor.send(box) {
        // `p` unavailable here!
        ...
      }
    }

  92. Philipp Haller
    Encapsulation
    Problem: not all types safe to transfer!
    class Message {
      var buffer: Array[Int] = _
      def leak(): Unit = {
        SomeObject.fld = buffer
      }
    }
    object SomeObject {
      var fld: Array[Int] = _
    }

  96. Philipp Haller
    Encapsulation
    • Ensuring absence of data races requires restricting types put into boxes
    • Requirements for “safe” classes:*
    – Methods only access parameters and this
    – Method parameter types are “safe”
    – Methods only instantiate “safe” classes
    – Types of fields are “safe”
    “Safe” = conforms to object capability model [4]
    * simplified
    [4] Mark S. Miller. Robust Composition: Towards a Unified Approach to
    Access Control and Concurrency Control. PhD thesis, 2006
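
To see why the first requirement matters, the leaking `Message` class from the Encapsulation example can be run as is. This is a sketch for illustration only: the names `LeakyMessage` and `demo` are hypothetical, and the "transfer" is only imagined in a comment. The method `leak()` writes to a global object, which is exactly what "methods only access parameters and this" rules out.

```scala
object LeakSketch {
  object SomeObject {
    var fld: Array[Int] = Array.empty[Int]
  }

  // Violates "methods only access parameters and this":
  // leak() stores an alias of the buffer in a global singleton.
  final class LeakyMessage {
    var buffer: Array[Int] = Array(1, 2, 3)
    def leak(): Unit = { SomeObject.fld = buffer }
  }

  def demo(): Int = {
    val msg = new LeakyMessage
    msg.leak()
    // Imagine `msg` has now been transferred to another actor.
    SomeObject.fld(0) = 42 // the sender can still write through the leaked alias
    msg.buffer(0)          // the receiver observes the write
  }
}
```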

  100. Philipp Haller
    Object Capabilities in Scala
    • How common is object-capability safe code in Scala?
    • Empirical study of over 75,000 SLOC of open-source Scala code:
    Project          Version      SLOC     GitHub stats
    Scala stdlib     2.11.7       33,107   ✭5,795  257
    Signal/Collect   8.0.6        10,159   ✭123    11
    GeoTrellis       0.10.0-RC2   35,351   ✭400    38
      -engine                      3,868
      -raster                     22,291
      -spark                       9,192

  103. Philipp Haller
    Object Capabilities in Scala
    Results of empirical study:
    Project          #classes/traits   #ocap (%)      #dir. insec. (%)
    Scala stdlib     1,505             644 (43%)      212/861 (25%)
    Signal/Collect   236               159 (67%)      60/77 (78%)
    GeoTrellis
      -engine        190               40 (21%)       124/150 (83%)
      -raster        670               233 (35%)      325/437 (74%)
      -spark         326               101 (31%)      167/225 (74%)
    Total            2,927             1,177 (40%)    888/1,750 (51%)
    Immutability inference increases these percentages!

  110. Philipp Haller
    Further Results
    • Object-oriented core languages
    – Formalization of object capabilities (type-based), uniqueness,
    separation, concurrency
    • Meta theory
    – Type soundness
    – Isolation theorem for processes with shared heap
    • Paper:
    [5] Haller and Loiko. LaCasa: Lightweight affinity and object capabilities in Scala. OOPSLA 2016

  117. Philipp Haller
    Ongoing Work
    • Flow-sensitive type checking
    • Empirical studies
    – How much effort to change existing code?
    • Complete mechanization of meta-theory
    28
    [6] Erik Reimers. Lightweight Software Isolation via Flow-Sensitive Capabilities in Scala.
    Master's thesis, KTH, 2017 (supervisor Philipp Haller)
    [7] Haller, Sommar. Towards an Empirical Study of Affine Types for Isolated Actors in
    Scala. PLACES@ETAPS 2017

  122. Philipp Haller
    LaCasa: Conclusion
    • Preserving actor isolation is possible when transferring objects
    conforming to the object capability discipline
    – Binary check whether a class is reusable unchanged
    • Integration with the full Scala language
    • In medium to large open-source Scala projects, 21-67% of all classes
    conform to the object capability discipline
    29

  123. Philipp Haller
    Overview
    • Motivation
    • Part 1: Type systems for data-race safe concurrency
    • Part 2: Practical deterministic concurrency
    • Part 3: Lineage-based distributed programming
    • Ongoing and future work
    • Conclusion
    30

  124. Philipp Haller
    From Data-Race Freedom to Determinism
    • LaCasa: prevent data races via type system
    • However, due to non-determinism, concurrent programs are difficult to reason
    about even when they are data-race free
    31

  127. Philipp Haller
    Non-Determinism: Example
    Example 1:
    @volatile var x = 0
    def m(): Unit = {
      Future {
        x = 1
      }
      Future {
        x = 2
      }
      .. // does not access x
    }
    What’s the value of x when an invocation of m returns?
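
A runnable variant of Example 1 makes the question easy to explore. This sketch deviates from the slide in one labeled way: `m` waits for both futures (an assumption added only so the final value can be inspected), which means the observed value is 1 or 2, whereas in the original `m` it could also still be 0 when `m` returns. The object name and the `finalValue` helper are hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NonDeterminismSketch {
  @volatile var x = 0

  def m(): Unit = {
    val f1 = Future { x = 1 }
    val f2 = Future { x = 2 }
    // Waiting is not part of the original example; it is added here
    // so that both writes have happened before x is read.
    Await.result(f1.zip(f2), 5.seconds)
    ()
  }

  // Resets x, runs m once, and reports which write happened last.
  def finalValue(): Int = {
    x = 0
    m()
    x
  }
}
```

There is no data race here, since `x` is volatile, yet the result still depends on which future runs last.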

  132. Philipp Haller
    Reordering not always a problem
    Example 2:
    33
    val set = Set.empty[Int]
    Future {
    set.put(1)
    }
    set.put(2)
    Eventually, set contains
    both 1 and 2, always
    Bottom line: it depends on the datatype
    Assume:
    concurrent set
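
A runnable sketch of Example 2, under the assumption that the "concurrent set" is the thread-safe key set from `java.util.concurrent.ConcurrentHashMap`. It uses `add` where the slide writes `put`, and the object name `CommutativeAdds` is hypothetical. Because insertions into a set commute, the final contents are the same for every schedule.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical wrapper object; not part of the talk.
object CommutativeAdds {
  // Returns true if the set contains both elements once the future is done.
  def run(): Boolean = {
    // Thread-safe set backed by a ConcurrentHashMap (Java standard library).
    val set = ConcurrentHashMap.newKeySet[Int]()
    val f = Future { set.add(1) }
    set.add(2)
    Await.ready(f, 5.seconds) // wait until the concurrent insertion has happened
    set.contains(1) && set.contains(2)
  }
}
```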

    View Slide

  135. Philipp Haller
    Non-Commutative Operations
    Example 3:
    34
    val set = Set.empty[Int]
    Future {
    set.put(1)
    }
    Future {
    if (set.contains(1)) {
    ..
    }
    }
    set.put(2)
    Result depends
    on schedule!
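
To make the schedule dependence of Example 3 concrete, the following sketch replays the two possible orders of the write and the read sequentially instead of relying on real thread timing. The object and method names are assumptions for illustration only.

```scala
import scala.collection.mutable

// Hypothetical helper; simulates two schedules of Example 3 deterministically.
object ScheduleDependence {
  // Runs the read `contains(1)` either before or after the write of 1
  // and returns what the reader observed.
  def observe(readFirst: Boolean): Boolean = {
    val set = mutable.Set.empty[Int]
    if (readFirst) {
      val seen = set.contains(1) // reader scheduled before the writer
      set += 1
      seen
    } else {
      set += 1 // writer scheduled before the reader
      set.contains(1)
    }
  }
}
```

The two calls `observe(readFirst = true)` and `observe(readFirst = false)` return different values, which is exactly why a read does not commute with an insertion.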

    View Slide

  140. Philipp Haller
    Goal
    • Programming model providing static determinism guarantees
    • More precisely:
    35
    "All non-failing executions
    compute the same result."
    "Quasi-determinism" [8]
    [8] Kuper et al. Freeze after writing: quasi-deterministic parallel programming with LVars. POPL 2014

    View Slide

  146. Philipp Haller
    Important Concerns
    • Starting from imperative, object-oriented language
    – global state
    – pervasive aliasing
    • Important concerns: expressivity and performance
    36
    Potential application to widely-used languages

    View Slide

  152. Philipp Haller
    Reactive Async: Approach
    • New programming model building on:
    – event-driven concurrency (similar to futures and promises)
    – lattice-based data types
    – reactive programming
    • Build on LaCasa's type system to provide quasi-determinism guarantee at
    compile time
    37
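
The role of lattice-based data types can be sketched as follows. This is not the Reactive Async API: the names `Lattice` and `Cell` and all signatures are assumptions chosen for illustration. A cell only ever moves upward in its lattice by joining in new values, so the final state does not depend on the order of updates.

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical names (Lattice, Cell); not the Reactive Async API.
trait Lattice[V] {
  def bottom: V
  // Least upper bound; must be commutative, associative, and idempotent.
  def join(a: V, b: V): V
}

final class Cell[V](lattice: Lattice[V]) {
  private val state = new AtomicReference[V](lattice.bottom)

  // Monotonic update: the new state is the join of the old state and v.
  def put(v: V): Unit = {
    state.updateAndGet(old => lattice.join(old, v))
    ()
  }

  def get: V = state.get()
}

object Cell {
  // Sets of Ints ordered by inclusion and joined by union.
  val intSetLattice: Lattice[Set[Int]] = new Lattice[Set[Int]] {
    def bottom: Set[Int] = Set.empty
    def join(a: Set[Int], b: Set[Int]): Set[Int] = a union b
  }
}
```

Putting `Set(1)` then `Set(2)` into one cell, and the same values in the opposite order into another, leaves both cells in the same state.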

    View Slide

  156. Philipp Haller
    Application: Static Program Analysis
    Example: return type analysis
    38
    Which types does
    method f possibly return?
    Class type hierarchy: D with subclasses E, F, G
    class C {
    def f(x: Int): D =
    if (x <= 0) g(x) else h(x-1)
    def g(y: Int): E = new E(y)
    def h(z: Int): D =
    if (z == 0) new F else f(z)
    }

    View Slide

  161. Philipp Haller
    Return Type Analysis (cont'd)
    • Method f calls methods g and h; method h calls method f
    • Programming model lets us express the resulting dependencies, forming a
    directed graph:
    39
    class C {
    def f(x: Int): D =
    if (x <= 0) g(x) else h(x-1)
    def g(y: Int): E = new E(y)
    def h(z: Int): D =
    if (z == 0) new F else f(z)
    }
    [Diagram: directed "calls" graph with edges f → g, f → h, and h → f]

    View Slide
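
The cyclic dependency between f and h can be resolved by iterating to a fixed point. The sketch below is a sequential simplification and not the concurrent Reactive Async implementation. The maps `direct` and `calls` are hand-written from the example class, and all names are hypothetical.

```scala
// Hypothetical, sequential sketch of the return type analysis for class C.
object ReturnTypeAnalysis {
  // Types allocated directly in each method body (from the `new` expressions).
  val direct: Map[String, Set[String]] =
    Map("f" -> Set.empty[String], "g" -> Set("E"), "h" -> Set("F"))

  // "calls" edges whose results flow into the caller's return value.
  val calls: Map[String, Set[String]] =
    Map("f" -> Set("g", "h"), "g" -> Set.empty[String], "h" -> Set("f"))

  // Iterates to the least fixed point; terminates because the sets only
  // grow and the universe of types is finite.
  def solve(): Map[String, Set[String]] = {
    var current = direct
    var changed = true
    while (changed) {
      val next = current.map { case (m, tys) =>
        m -> (tys ++ calls(m).flatMap(callee => current(callee)))
      }
      changed = next != current
      current = next
    }
    current
  }
}
```

Under these assumptions, `solve()` reports that f and h may each return E or F, while g returns only E.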

  168. Philipp Haller
    Evaluation
    • Static analysis of JVM bytecode using the OPAL framework (OPen
    extensible Analysis Library)
    – Concurrent design
    • Rewrote purity analysis and immutability analysis
    • Ran analysis on JDK 8 update 45
    – 18,591 class files, 163,268 methods, 77,128 fields
    40
    http://www.opal-project.de

    View Slide

  170. Philipp Haller
    Results: Immutability Analysis
    • Reactive Async (RA) about 10x faster than FPCF (OPAL's fixed point computation framework)
    • RA = 294 LOC, FPCF = 424 LOC (1.44x)
    41
    [Figure: box plots of runtime in seconds for FPCF and Reactive-Async with 1, 2, 4, 8, and 16 threads]
    Box plot: whiskers = min/max; top/bottom of box = 3rd and 1st quartile; band in box = median

    View Slide

  172. Philipp Haller
    Reactive Async: Conclusion
    • Deterministic concurrent programming model
    – Extension of imperative, object-oriented base language
    – Resolution of cyclic dependencies
    – Type system for object capabilities for safety
    • Experimental evaluation using large-scale, concurrent static analysis
    • Prototype implementation: https://github.com/phaller/reactive-async
    42
    [9] Haller, Geries, Eichberg, and Salvaneschi. Reactive Async: Expressive deterministic concurrency. Scala Symposium 2016

    View Slide

  173. Philipp Haller
    Overview
    • Motivation
    • Part 1: Type systems for data-race safe concurrency
    • Part 2: Practical deterministic concurrency
    • Part 3: Lineage-based distributed programming
    • Ongoing and future work
    • Conclusion
    43

    View Slide

  174. Philipp Haller
    Lineage
    Which resources are required for producing a
    particular expected result?
    Lineage may record information about:
    • Data sets read/transformed for producing result data set
    • Services used for producing response
    • Etc.
    44

    View Slide

  179. Philipp Haller
    Distributed programming with functional
    lineages a.k.a. function passing
    New data-centric programming model for functional
    processing of distributed data.
    Key ideas:
    • Provide lineages by programming abstractions
    • Keep data stationary (if possible), send functions
    • Utilize lineages for fault recovery
    45

    View Slide

  181. Philipp Haller
    Introducing the function passing model
    Consists of 3 parts:
    Silos: stationary, typed, immutable data
    containers
    SiloRefs: references to local or remote Silos.
    Spores [10]: safe, serializable functions.
    46
    [10] Miller, Haller, and Odersky. Spores: a type-based foundation for closures
    in the age of concurrency and distribution. ECOOP 2014

    View Slide

  183. Philipp Haller
    Some visual intuition of the function passing model
    [Diagram: Master and Worker nodes with Silos and SiloRefs]
    47

    View Slide

  186. Philipp Haller
    Silos
    What are they?
    Two parts.
    Silo[T]. Typed, stationary data container.
    SiloRef[T]. Handle to a Silo.
    User interacts with SiloRef.
    SiloRefs come with 4 primitive operations: apply, send, persist, unpersist.
    48

    View Slide

  187. Philipp Haller
    Silos
    What are they?
    Primitive: apply (deferred)
    def apply[S](fun: T => SiloRef[S]): SiloRef[S]
    Takes a function that is to be applied to the data in the
    silo associated with the SiloRef.
    Creates new silo to contain the data that the user-defined
    function returns; evaluation is deferred.
    Enables interesting computation DAGs.
    49

    View Slide

  188. Philipp Haller
    Silos
    What are they?
    Primitive: send (eager)
    def send(): Future[T]
    Forces the built-up computation DAG to be sent to the
    associated node and applied.
    Future is completed with the result of the computation.
    50

    View Slide

  189. Philipp Haller
    Silos
    What are they?
    Silo factories (deferred):
    Creates silo on given host populated with given value/text file/…
    object SiloRef {
      def populate[T](host: Host, value: T): SiloRef[T]
      def fromTextFile(host: Host, file: File): SiloRef[List[String]]
      ...
    }
    51
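
The split between the deferred `apply` and the eager `send` can be illustrated with an in-process stand-in. This is only a sketch under strong assumptions: everything runs in one JVM, plain functions replace spores, there is no `Host`, and the names `LocalSiloRef`, `Populated`, `Applied`, and `materialize` are hypothetical rather than taken from the function passing library.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// In-process stand-in for illustration only; the real model ships spores
// to remote hosts. All names besides apply/send/populate are assumptions.
sealed trait LocalSiloRef[T] {
  // Evaluates the DAG that this reference describes.
  def materialize(): T

  // Deferred: only records the function as a new DAG node.
  def apply[S](fun: T => LocalSiloRef[S]): LocalSiloRef[S] =
    new Applied(this, fun)

  // Eager: evaluates the recorded DAG and completes a future with the data.
  def send()(implicit ec: ExecutionContext): Future[T] = Future(materialize())
}

final class Populated[T](value: T) extends LocalSiloRef[T] {
  def materialize(): T = value
}

final class Applied[A, T](parent: LocalSiloRef[A], fun: A => LocalSiloRef[T])
    extends LocalSiloRef[T] {
  def materialize(): T = fun(parent.materialize()).materialize()
}

object LocalSiloRef {
  def populate[T](value: T): LocalSiloRef[T] = new Populated(value)
}

object LocalSiloDemo {
  def run(): List[Int] = {
    import ExecutionContext.Implicits.global
    val doubled = LocalSiloRef.populate(List(1, 2, 3)).apply { xs =>
      LocalSiloRef.populate(xs.map(_ * 2))
    }
    // Nothing has been computed yet; send() triggers the evaluation.
    Await.result(doubled.send(), 5.seconds)
  }
}
```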

    View Slide

  193. Philipp Haller
    Basic idea: apply/send
    [Diagram: Machine 1 and Machine 2; SiloRef[T] refers to Silo[T] holding data T; applying λ yields SiloRef[S] and Silo[S] holding data S]
    52

    View Slide

  196. Philipp Haller
    More involved example
    [Diagram: Machine 1 and Machine 2; SiloRef[List[Person]] and Silo[List[Person]]; DAG nodes persons, vehicles, adults, owners]
    Let’s make an interesting DAG!
    val persons: SiloRef[List[Person]] = ...
    val vehicles: SiloRef[List[Vehicle]] = ...
    val adults =
      persons.apply(spore { ps =>
        val res = ps.filter(p => p.age >= 18)
        SiloRef.populate(currentHost, res)
      })
    // adults that own a vehicle
    val owners = adults.apply(spore {
      val localVehicles = vehicles // spore header
      ps =>
        localVehicles.apply(spore {
          val localps = ps // spore header
          vs =>
            SiloRef.populate(currentHost,
              localps.flatMap(p =>
                // list of (p, v) for a single person p
                vs.flatMap { v =>
                  if (v.owner.name == p.name) List((p, v))
                  else Nil
                }
              ))
        })
    })
    55

    View Slide

  198. Philipp Haller
    More involved example
    Silo[List[Person]]
    Machine 1
    SiloRef[List[Person]]
    Let’s make an interesting DAG!
    Machine 2
    persons:
    val persons: SiloRef[List[Person]] = ...
    val vehicles: SiloRef[List[Vehicle]] = ...
    // adults that own a vehicle
    val owners = adults.apply(...)
    adults
    owners
    vehicles
    val sorted =
    adults.apply(spore { ps =>
    SiloRef.populate(currentHost,
    ps.sortBy(p => p.age))
    })
    val labels =
    sorted.apply(spore { ps =>
    SiloRef.populate(currentHost,
    ps.map(p => "Hi, " + p.name))
    })
    sorted
    labels
    val adults =
    persons.apply(spore { ps =>
    val res = ps.filter(p => p.age >= 18)
    SiloRef.populate(currentHost, res)
    })
    57

    View Slide

  199. Philipp Haller
    More involved example
    Silo[List[Person]]
    Machine 1
    SiloRef[List[Person]]
    Let’s make an interesting DAG!
    Machine 2
    persons:
    val persons: SiloRef[List[Person]] = ...
    val vehicles: SiloRef[List[Vehicle]] = ...
    // adults that own a vehicle
    val owners = adults.apply(...)
    adults
    owners
    vehicles
    sorted
    labels
    So far we have only staged the computation; we haven’t yet
    “kicked it off”.
    val adults =
    persons.apply(spore { ps =>
    val res = ps.filter(p => p.age >= 18)
    SiloRef.populate(currentHost, res)
    })
    val sorted =
    adults.apply(spore { ps =>
    SiloRef.populate(currentHost,
    ps.sortBy(p => p.age))
    })
    val labels =
    sorted.apply(spore { ps =>
    SiloRef.populate(currentHost,
    ps.map(p => "Hi, " + p.name))
    })
    58

    View Slide

  201. Philipp Haller
    More involved example
    Silo[List[Person]]
    Machine 1
    SiloRef[List[Person]]
    Let’s make an interesting DAG!
    Machine 2
    persons:
    val persons: SiloRef[List[Person]] = ...
    val vehicles: SiloRef[List[Vehicle]] = ...
    // adults that own a vehicle
    val owners = adults.apply(...)
    adults
    owners
    vehicles
    sorted
    labels λ
    List[Person] => List[String]
    val adults =
    persons.apply(spore { ps =>
    val res = ps.filter(p => p.age >= 18)
    SiloRef.populate(currentHost, res)
    })
    val sorted =
    adults.apply(spore { ps =>
    SiloRef.populate(currentHost,
    ps.sortBy(p => p.age))
    })
    val labels =
    sorted.apply(spore { ps =>
    SiloRef.populate(currentHost,
    ps.map(p => "Hi, " + p.name))
    })
    labels.send()
    59

    View Slide

  202. Philipp Haller
    More involved example
    [Diagram: Machine 1 and Machine 2; SiloRef[List[Person]] and Silo[List[Person]]; DAG nodes persons, vehicles, adults, owners, sorted, labels; result Silo[List[String]]]
    Let’s make an interesting DAG!
    val persons: SiloRef[List[Person]] = ...
    val vehicles: SiloRef[List[Vehicle]] = ...
    val adults =
      persons.apply(spore { ps =>
        val res = ps.filter(p => p.age >= 18)
        SiloRef.populate(currentHost, res)
      })
    // adults that own a vehicle
    val owners = adults.apply(...)
    val sorted =
      adults.apply(spore { ps =>
        SiloRef.populate(currentHost,
          ps.sortBy(p => p.age))
      })
    val labels =
      sorted.apply(spore { ps =>
        SiloRef.populate(currentHost,
          ps.map(p => "Hi, " + p.name))
      })
    labels.send()
    59

    View Slide

  203. Philipp Haller
    A functional design for fault-tolerance
    A SiloRef is a lineage, a persistent (in the sense
    of functional programming) data structure.
    The lineage is the DAG of operations used to
    derive the data of a silo.
    Since the lineage is composed of spores, it is
    serializable. This means it can be persisted or
    transferred to other machines.
    Putting lineages to work
    60
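
How a lineage supports fault recovery can be sketched with a small model. This is a simplification under stated assumptions: data is fixed to `List[Int]`, plain functions stand in for spores, and `Source`, `Derived`, and `Worker` are hypothetical names. Because the lineage is an immutable description of how the data was derived, a fresh worker can rebuild a lost silo by replaying it.

```scala
import scala.collection.mutable

// Hypothetical model of lineage-based recovery; not the formalization from the paper.
object LineageRecovery {
  // A lineage node: either initial data or a function applied to a parent.
  sealed trait Lineage { def replay(): List[Int] }

  final case class Source(data: List[Int]) extends Lineage {
    def replay(): List[Int] = data
  }

  final case class Derived(parent: Lineage, op: List[Int] => List[Int]) extends Lineage {
    def replay(): List[Int] = op(parent.replay())
  }

  // Hypothetical worker: materialized silos kept in memory, keyed by lineage.
  final class Worker {
    private val silos = mutable.Map.empty[Lineage, List[Int]]

    def materialize(l: Lineage): List[Int] = silos.getOrElseUpdate(l, l.replay())

    def crash(): Unit = silos.clear() // simulates losing all in-memory data

    def holds(l: Lineage): Boolean = silos.contains(l)
  }
}
```

After `crash()`, the first worker no longer holds the data, yet a second worker that receives the same lineage value materializes an identical result.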

    View Slide

  204. Philipp Haller
    A functional design for fault-tolerance
    Putting lineages to work
    Next: we formalize lineages, a concept from the
    database and systems communities, in the context of
    programming languages. A natural fit in the context of functional programming!
    Formalization: typed, distributed core
    language with spores, silos, and futures.
    61

    View Slide

  205. Philipp Haller 62
    Abstract syntax

    View Slide

  206. Philipp Haller 63
    Local reduction and lineages

    View Slide

  208. Philipp Haller 64
    Distributed reduction

    View Slide

  209. Philipp Haller 65
    Type assignment

    View Slide

  213. Philipp Haller
    Properties of function passing model
    Formalization
    Subject reduction theorem guarantees
    preservation of types under reduction, as well as
    preservation of lineage mobility.
    Progress theorem guarantees the finite
    materialization of remote, lineage-based data.
    66
    First correctness results for a programming model
    for lineage-based distributed computation.

    View Slide

  214. Philipp Haller
    Liveness property: finite materialization
    Properties
    67

    View Slide

  217. Philipp Haller
    Paper
    Details, proofs, etc.
    68
    [11] Haller, Miller, and Müller. A programming model and foundation for lineage-based
    distributed computation. Journal of Functional Programming 28 (2018): e7

    View Slide

  225. Philipp Haller
    Summary
    • LaCasa: provably safe software isolation, code reuse via object capabilities
    • Reactive Async: practical deterministic concurrency
    – For an imperative, object-oriented language
    – Type system guarantees quasi-determinism at compile time
    • Lineage-based distributed programming
    – First correctness results for a lineage-based distributed programming
    model
    • Finite materialization of distributed, lineage-based data
    69

    View Slide

  232. Philipp Haller
    Ongoing and Future Work
    70
    Distributed Shared State
    • Consistency, availability, partition tolerance
    • Determinism
    [12] Zhao, Haller. Observable atomic consistency for CvRDTs. CoRR abs/1802.09462 (2018)
    Security & Privacy
    • Privacy-aware distribution
    • Information-flow security
    [13] Salvaneschi, Köhler, Haller, Erdweg, and Mezini. Language-Integrated Privacy-Aware
    Distributed Queries. 2018, draft
    Chaos Engineering
    • Testing hypotheses about resilience in production systems
    [14] Zhang, Morin, Haller, Baudry, Monperrus. A Chaos Engineering System for Live Analysis
    and Falsification of Exception-handling in the JVM. CoRR abs/1805.05246 (2018)

    View Slide

  233. Philipp Haller
    Organization of Scientific Meetings
    • NII Shonan Meeting: Haller, Salvaneschi, Watanabe, Agha.
    "Programming Languages for Distributed Systems", May 27–30, 2019
    • Dagstuhl Seminar: Haller, Lopes, Markl, Salvaneschi.
    "Programming Languages for Distributed Systems and Distributed Data
    Management" (65-0618), October 28–31, 2019
    71

    View Slide

  234. Philipp Haller
    References
    • [1]: https://www.itbusinessedge.com/cm/blogs/lawson/the-big-data-software-problem-behind-cerns-higgs-boson-hunt/?cs=50736
    • [2]: http://store.steampowered.com/stats/content/
    • [3]: https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/
    • [4] Mark S. Miller. Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control. PhD thesis, 2006
    • [5] Haller and Loiko. LaCasa: Lightweight affinity and object capabilities in Scala. OOPSLA 2016
    • [6] Erik Reimers. Lightweight Software Isolation via Flow-Sensitive Capabilities in Scala. Master's thesis, KTH, 2017 (supervisor Philipp Haller)
    • [7] Haller, Sommar. Towards an Empirical Study of Affine Types for Isolated Actors in Scala. PLACES@ETAPS 2017
    • [8] Kuper et al. Freeze after writing: quasi-deterministic parallel programming with LVars. POPL 2014
    • [9] Haller, Geries, Eichberg, and Salvaneschi. Reactive Async: Expressive Deterministic Concurrency. Scala Symposium 2016
    • [10] Miller, Haller, and Odersky. Spores: a type-based foundation for closures in the age of concurrency and distribution. ECOOP 2014
    • [11] Haller, Miller, and Müller. A programming model and foundation for lineage-based distributed computation. Journal of Functional Programming 28 (2018): e7
    • [12] Zhao, Haller. Observable atomic consistency for CvRDTs. CoRR abs/1802.09462 (2018)
    • [13] Salvaneschi, Köhler, Haller, Erdweg, and Mezini. Language-Integrated Privacy-Aware Distributed Queries. 2018, draft
    • [14] Zhang, Morin, Haller, Baudry, Monperrus. A Chaos Engineering System for Live Analysis and Falsification of Exception-handling in the JVM. CoRR abs/1805.05246 (2018)

  242. Philipp Haller
    Conclusion
    • Goal:
      Robust construction of large-scale concurrent and distributed systems
      providing high availability, high scalability, and fault tolerance
    • Methods:
    – Type systems: theory & practice
    – Design and implementation of programming systems
    – Empirical studies
    Sound foundations and
    provable guarantees!
    Thank You!
