Slide 1

Slide 1 text

SPORES: Distributable Functions in Scala. Improving Support for Distributed Programming in Scala. Heather Miller, @heathercmiller, heather.miller@epfl.ch

Slide 2

Slide 2 text

SPORES & PICKLES: Improving Support for Distributed Programming in Scala. Heather Miller, @heathercmiller, heather.miller@epfl.ch

Slide 3

Slide 3 text

with: Philipp Haller (Typesafe), Eugene Burmako (EPFL), Martin Odersky (EPFL/Typesafe)

Slide 4

Slide 4 text

What is this talk about?

Slide 5

Slide 5 text

What is this talk about? Making distributed programming easier in scala

Slide 6

Slide 6 text

This kind of distributed system

Slide 7

Slide 7 text

This kind of distributed system (insert social network of your choice here!)

Slide 8

Slide 8 text

Bottom line:

Slide 9

Slide 9 text

Bottom line: How can we simplify distribution at the language level?

Slide 10

Slide 10 text

Spores pickling

Slide 11

Slide 11 text

Spores pickling 75%

Slide 12

Slide 12 text

Spores pickling 75% 25%

Slide 13

Slide 13 text

Spores and pickling: THIS STUFF IS BOTH RESEARCH & intended for production

Slide 14

Slide 14 text

Pickling
RESEARCH: Accepted for publication at OOPSLA’13. Instant Pickles: Generating Object-Oriented Pickler Combinators for Fast and Extensible Serialization, Heather Miller, Philipp Haller, Eugene Burmako, Martin Odersky. OOPSLA’13, Indianapolis, IN, October 26-31, 2013.
PRACTICE: Used by a handful of companies. Scala Language Proposal in the works.

Slide 15

Slide 15 text

Spores pickling

Slide 16

Slide 16 text

Spores
RESEARCH: Academic paper on the foundations and practical benefits of Scala’s spores in the works. Draft will be available in the coming months.
PRACTICE: Scala Improvement Proposal posted; lots of user feedback, which helped reformulate the design. A release of what will come in the Scala 2.11 distribution is coming soon.

Slide 17

Slide 17 text

SPORES!

Slide 18

Slide 18 text

Imagine the... SINGLE MACHINE SCENARIO

Slide 19

Slide 19 text

closures are wonderful

Slide 20

Slide 20 text

Closures are wonderful. How do you do FP without them?

Slide 21

Slide 21 text

Closures are wonderful. How do you do FP without them? Ok, ok, anonymous inner classes. But still.

Slide 22

Slide 22 text

Closures are wonderful. How do you do FP without them? Ok, ok, anonymous inner classes. But still. FP is all about transformations on immutable data. These transformations are just closures.

Slide 23

Slide 23 text

Closures are wonderful. How do you do FP without them? Ok, ok, anonymous inner classes. But still. FP is all about transformations on immutable data. These transformations are just closures. Typically, you pass closures to higher-order functions (monads backed by data, like lists, options or futures).

Slide 24

Slide 24 text

Closures are wonderful. How do you do FP without them? Ok, ok, anonymous inner classes. But still. FP is all about transformations on immutable data. These transformations are just closures. Typically, you pass closures to higher-order functions (monads backed by data, like lists, options or futures). Essentially, you’re sending the closure to the data.

Slide 25

Slide 25 text

Closures are wonderful. How do you do FP without them? Ok, ok, anonymous inner classes. But still. FP is all about transformations on immutable data. These transformations are just closures. Typically, you pass closures to higher-order functions (monads backed by data, like lists, options or futures). Essentially, you’re sending the closure to the data. Even Java’s going to get closures.

Slide 26

Slide 26 text

Back to thinking distributed.

Slide 27

Slide 27 text

Closures are awesome.

Slide 28

Slide 28 text

Closures are awesome. But we can’t really distribute them.

Slide 29

Slide 29 text

But we can’t really distribute them. WHY?
• OFTEN NOT SERIALIZABLE: because they capture stuff that’s not serializable.
• ACCIDENTAL CAPTURE: easy to reference something and unknowingly capture it. ...enclosing this, anyone?

Slide 30

Slide 30 text

But we can’t really distribute them. WHY?
• OFTEN NOT SERIALIZABLE: because they capture stuff that’s not serializable.
• ACCIDENTAL CAPTURE: easy to reference something and unknowingly capture it. ...enclosing this, anyone?
• RUNTIME ERRORS instead of compile-time checks.

Slide 31

Slide 31 text

But we can’t really distribute them. WHY?
• OFTEN NOT SERIALIZABLE: because they capture stuff that’s not serializable.
• ACCIDENTAL CAPTURE: easy to reference something and unknowingly capture it. ...enclosing this, anyone?
• RUNTIME ERRORS instead of compile-time checks.
• WHOSE FAULT IS IT? For a user, it is often unclear whether it’s a user error or the framework.

Slide 32

Slide 32 text

But we can’t really distribute them. Consequences that follow from these problems...

Slide 33

Slide 33 text

But we can’t really distribute them. Consequences that follow from these problems...
• FRAMEWORK BUILDERS AVOID THEM ...not just in their public APIs, but private ones too.
• Users shoot themselves in the foot and blame the framework.

Slide 34

Slide 34 text

But we can’t really distribute them. Consequences that follow from these problems...
• FRAMEWORK BUILDERS AVOID THEM ...not just in their public APIs, but private ones too.
• Users shoot themselves in the foot and blame the framework.
• When picking battles, framework designers tend to avoid issues with closures.

Slide 35

Slide 35 text

But we can’t really distribute them. Consequences that follow from these problems...
• FRAMEWORK BUILDERS AVOID THEM ...not just in their public APIs, but private ones too.
• Users shoot themselves in the foot and blame the framework. Rightfully so.
• When picking battles, framework designers tend to avoid issues with closures.

Slide 36

Slide 36 text

Have you heard this before?

Slide 37

Slide 37 text

Have you heard this before? FUNCTIONAL PROGRAMMING: The Way To Go™ ...for parallelism/concurrency/distribution

Slide 38

Slide 38 text

Have you heard this before? FUNCTIONAL PROGRAMMING: The Way To Go™ ...for parallelism/concurrency/distribution. ...And composing and passing functions around is the way to do FP. However, managing closures in a concurrent or distributed environment, or writing APIs to be used by clients in such an environment, remains considerably precarious.

Slide 39

Slide 39 text

But wouldn’t it be nice if we could... more reliably develop new types of frameworks for distributed programming based on passing functions? ...foolproof distributed collections, distributed streams, ...

Slide 40

Slide 40 text

ENTER: spores

Slide 41

Slide 41 text

ENTER: spores
1. mainline spores: proposed for inclusion in Scala 2.11
2. spores with type constraints: research project @EPFL, research paper in the works

Slide 42

Slide 42 text

Spores
1. mainline spores: proposed for inclusion in Scala 2.11
2. spores with type constraints: research project @EPFL, research paper in the works

Slide 43

Slide 43 text

Spores: what are they? Small units of possibly mobile functional behavior. http://docs.scala-lang.org/sips/pending/spores.html Proposed for inclusion in Scala 2.11

Slide 44

Slide 44 text

Spores: what are they? A closure-like abstraction for use in distributed or concurrent environments. Goal: well-behaved closures with controlled environments that can avoid various hazards.

Slide 45

Slide 45 text

Spores: what are they? A closure-like abstraction for use in distributed or concurrent environments. Goal: well-behaved closures with controlled environments that can avoid various hazards.
Potential hazards when using closures incorrectly:
• memory leaks
• race conditions due to capturing mutable references
• runtime serialization errors due to unintended capture of references

Slide 46

Slide 46 text

Spark

    class MyCoolRddApp {
      val param = 3.14
      val log = new Log(...)
      ...
      def work(rdd: RDD[Int]) {
        rdd.map(x => x + param)
           .reduce(...)
      }
    }

Slide 47

Slide 47 text

Spark

    class MyCoolRddApp {
      val param = 3.14
      val log = new Log(...)
      ...
      def work(rdd: RDD[Int]) {
        rdd.map(x => x + param)
           .reduce(...)
      }
    }

Problem: (x => x + param) is not serializable because it captures this of type MyCoolRddApp, which is itself not serializable.

Slide 48

Slide 48 text

Akka/futures

    def receive = {
      case Request(data) =>
        future {
          val result = transform(data)
          sender ! Response(result)
        }
    }

Problem: the Akka actor spawns a future to concurrently process incoming requests.

Slide 49

Slide 49 text

Akka/futures

    def receive = {
      case Request(data) =>
        future {
          val result = transform(data)
          sender ! Response(result)
        }
    }

Problem: the Akka actor spawns a future to concurrently process incoming requests.

Slide 50

Slide 50 text

Akka/futures

    def receive = {
      case Request(data) =>
        future {
          val result = transform(data)
          sender ! Response(result)
        }
    }

Problem: the Akka actor spawns a future to concurrently process incoming requests, and sender is not a stable value! It’s a method call!
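The hazard on this slide can be reproduced without Akka. The following is a minimal sketch using only the standard library's futures; FakeActor, handleUnsafe, handleSafe and replyTarget are hypothetical names invented for illustration, and the String "sender" stands in for a real ActorRef.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for an actor: `sender` is a method whose result
// changes with every message that is handled.
class FakeActor {
  @volatile private var current: String = "nobody"
  def sender: String = current

  // Unsafe: `sender` is called inside the future, whenever the future runs.
  def handleUnsafe(from: String, gate: Future[Unit]): Future[String] = {
    current = from
    gate.map(_ => sender)
  }

  // Safe: call `sender` once, now, and capture the resulting value.
  def handleSafe(from: String, gate: Future[Unit]): Future[String] = {
    current = from
    val replyTo = sender
    gate.map(_ => replyTo)
  }
}

object SenderDemo {
  // Handles a message from "alice", then one from "bob", and only afterwards
  // lets the first future run. Returns who the first future replies to.
  def replyTarget(safe: Boolean): String = {
    val actor = new FakeActor
    val gate = Promise[Unit]()
    val first =
      if (safe) actor.handleSafe("alice", gate.future)
      else actor.handleUnsafe("alice", gate.future)
    actor.handleSafe("bob", Future.successful(())) // the next message arrives
    gate.success(())                               // now the first future may run
    Await.result(first, 5.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(replyTarget(safe = false)) // bob: the reply goes to the wrong sender
    println(replyTarget(safe = true))  // alice
  }
}
```

The safe variant is the same move a spore header makes explicit: evaluate the unstable expression once, bind it to a val, and let the closure refer only to that val.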

Slide 51

Slide 51 text

Serialization

    case class Helper(name: String)

    class Main {
      val helper = Helper("the helper")
      val fun: Int => Unit = (x: Int) => {
        val result = x + " " + helper.toString
        println("The result is: " + result)
      }
    }

Problem: fun is not serializable. It accidentally captures this, since helper.toString is really this.helper.toString, and Main (the type of this) is not serializable.
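The accidental capture can be observed directly with Java serialization. The sketch below is an illustration, not part of the talk: it assumes Scala 2.12 or later (where function literals are serializable whenever everything they capture is), and the names leaky, safe, CaptureDemo and serializes are made up for the example.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

case class Helper(name: String)

// Deliberately NOT Serializable, like Main on the slide.
class Main {
  val helper = Helper("the helper")

  // `helper` here really means `this.helper`, so the closure drags in all of Main.
  val leaky: Int => String = (x: Int) => x.toString + " " + helper.toString

  // Copy the field into a local val first; the closure then captures only `h`.
  val safe: Int => String = {
    val h = helper
    (x: Int) => x.toString + " " + h.toString
  }
}

object CaptureDemo {
  // Returns true if Java serialization accepts the object.
  def serializes(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val m = new Main
    println("leaky serializes: " + serializes(m.leaky))
    println("safe serializes:  " + serializes(m.safe))
  }
}
```

Nothing in the compiler flags leaky; the failure only shows up when something tries to serialize it, which is exactly the runtime-error-instead-of-compile-time-check problem raised earlier in the deck.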

Slide 53

Slide 53 text

We need safer closures for concurrent & distributed scenarios. Sure.

Slide 54

Slide 54 text

We need safer closures for concurrent & distributed scenarios. Sure. What do these things look like?

Slide 55

Slide 55 text

What do spores look like?

    val s = spore {
      val h = helper
      (x: Int) => {
        val result = x + " " + h.toString
        println("The result is: " + result)
      }
    }

THE BODY OF A SPORE CONSISTS OF 2 PARTS
1. a sequence of local value (val) declarations only (the “spore header”), and
2. a closure

Slide 56

Slide 56 text

Spores vs. closures
1. All captured variables are declared in the spore header, or using capture
2. The initializers of captured variables are executed once, upon creation of the spore
3. References to captured variables do not change during the spore’s execution
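The three properties above can be imitated in plain Scala, which may help when reading the spore examples. This is only a sketch of the shape and evaluation order: it does not use the spores library, and unlike a real spore nothing here checks at compile time that the closure refers only to its header and its parameter. SporeShapeDemo, expensiveLookup and makeAdder are hypothetical names.

```scala
object SporeShapeDemo {
  // Counts how often the header initializer runs.
  var initCount = 0

  def expensiveLookup(): Int = { initCount += 1; 42 }

  // Same shape as a spore body: a header of vals, followed by a closure.
  def makeAdder(): Int => Int = {
    val h = expensiveLookup() // "header": evaluated once, when the function is created
    (x: Int) => x + h         // closure: refers only to its parameter and the header
  }

  def main(args: Array[String]): Unit = {
    val add = makeAdder()
    println(add(1))    // 43
    println(add(2))    // 44
    println(initCount) // 1: the initializer did not run again on application
  }
}
```

Because h is an immutable local, the value the closure sees cannot change between creation and execution, which is property 3 in miniature.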

Slide 57

Slide 57 text

Spores & closures
Evaluation semantics: remove the spore marker, and the code behaves as before.
Spores & closures are related: you can write a full function literal and pass it to something that expects a spore. (Of course, only if the function literal satisfies the spore rules.)

Slide 58

Slide 58 text

How can you use a spore? In APIs

    def sendOverWire(s: Spore[Int, Int]): Unit = ...
    // ...
    sendOverWire((x: Int) => x * x - 2)

If you want parameters to be spores, then you can write it this way.

Slide 59

Slide 59 text

How can you use a spore? In for-comprehensions

    def lookup(i: Int): DCollection[Int] = ...
    val indices: DCollection[Int] = ...

    for { i <- indices
          j <- lookup(i)
    } yield j + capture(i)

    trait DCollection[A] {
      def map[B](sp: Spore[A, B]): DCollection[B]
      def flatMap[B](sp: Spore[A, DCollection[B]]): DCollection[B]
    }

Slide 60

Slide 60 text

What does all of that get you?

Slide 61

Slide 61 text

What does all of that get you? Since... captured expressions are evaluated upon spore creation. That means... spores are like function values with an immutable environment. Plus, the environment is specified and checked: no accidental capturing.

Slide 62

Slide 62 text

What does all of that get you? Or, graphically...
[Figure: environments of spores vs. closures, right after creation and during execution.]

Slide 63

Slide 63 text

What does all of that get you? Or, graphically...
[Figure: environments of spores vs. closures, right after creation and during execution. Labels shown: 5, ‘a’, ?, ?]

Slide 64

Slide 64 text

What does all of that get you? Or, graphically...
[Figure: environments of spores vs. closures, right after creation and during execution. Labels shown: 5, ‘a’ (twice), ?, ?]

Slide 65

Slide 65 text

What does all of that get you? Or, graphically...
[Figure: environments of spores vs. closures, right after creation and during execution. Labels shown: 5, ‘a’ (twice), ?, ?. Caption: “I’m in ur stuff draggin around ur object graf”]

Slide 66

Slide 66 text

Cool Patterns: so now that we have spores, what kind of cool patterns can they enable?

Slide 67

Slide 67 text

Patterns ENABLED BY SPORES
Stage and Ship: Build up a computation graph such that the behavior is represented as spores (these could be pipeline stages that are plugged together). Once the computation has been built, it can be safely shipped to remote nodes.

Slide 68

Slide 68 text

Patterns ENABLED BY SPORES
Function-passing Style Concurrency: Pass spores between concurrent entities. Compose spores with other spores to form larger, composite spores, and then send.

Slide 69

Slide 69 text

Patterns ENABLED BY SPORES
Hot-swapping actor behavior: You have a running actor and want to change its behavior. Send it a spore and instruct the actor to use the spore from now on to process its incoming messages.

Slide 70

Slide 70 text

NEW PATTERNS ➡ NEW FRAMEWORKS
SOME IDEAS for potential FRAMEWORKS that spores might help:
• Distributed Collections
• Distributed pipelines, stream processing
• Function-passing style frameworks

Slide 71

Slide 71 text

C .

Slide 72

Slide 72 text

What if I capture a Socket?

Slide 73

Slide 73 text

What if I capture a Socket? REALM OF RESEARCH

Slide 74

Slide 74 text

Spores
1. mainline spores: proposed for inclusion in Scala 2.11
2. spores with type constraints: research project @EPFL, research paper in the works

Slide 75

Slide 75 text

What if I capture a Socket?

Slide 76

Slide 76 text

RESEARCH: We don't have the means yet for frameworks to express these kinds of constraints or to enforce them when we create and compose spores. Wouldn't it be nice if we could add these constraints in a friendly and composable way?

Slide 77

Slide 77 text

RESEARCH: spores with type constraints
Keep track of captured types ...at compile-time. The spore macro (a whitebox macro) can synthesize precise types automatically for newly created spores:

    spore { val x: Int = list.size; val a: ActorRef = this.sender
      (y: Int) => ...
    } exclude[Actor]

synthesized type:

    Spore[Int, ...] {
      type Excluded = NoCapture[Actor]
      type Facts = Captured[Int] with Captured[ActorRef]
    }

Slide 78

Slide 78 text

RESEARCH: spores with type constraints, composition
Basic composition operators (same as for regular functions): andThen, compose
How do we synthesize the result type of s1 andThen s2? The result type is synthesized by the andThen macro:
• type member Facts takes the “union” of the facts of s1 and s2
• type member Excluded: conjunction of excluded types; needs to check Facts to see if possible

Slide 79

Slide 79 text

RESEARCH: spores with type constraints, composition

    val s1: Spore[Int, String] {
      type Excluded = NoCapture[Actor]
      type Facts = Captured[Int] with Captured[ActorRef]
    } = ...

    val s2: Spore[String, String] {
      type Excluded = NoCapture[RDD[Int]]
      type Facts = Captured[Actor]
    }

    s1 andThen s2  // does not compile

Slide 81

Slide 81 text

RESEARCH: spores with type constraints, composition

    val s1: Spore[Int, String] {
      type Excluded = NoCapture[Actor]
      type Facts = Captured[Int] with Captured[ActorRef]
    } = ...

    val s2: Spore[String, String] {
      type Excluded = NoCapture[RDD[Int]]
    }

    s1 andThen s2: Spore[Int, String] {
      type Excluded = NoCapture[Actor] with NoCapture[RDD[Int]]
      type Facts = Captured[Int] with Captured[ActorRef]
    }
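The way Excluded accumulates under composition can be approximated with ordinary type members and an explicit type signature instead of a macro. The sketch below is an assumption-laden toy, not the research prototype: Sp, sp and this andThen are hypothetical, Facts is left out entirely, and so the check that rejects the earlier non-compiling example is not modelled.

```scala
object ConstraintDemo {
  // Marker types named after the slides; these are stand-ins, not the real API.
  trait NoCapture[T]
  class Actor
  class RDD[T]

  // A function value that carries its exclusions in a type member.
  trait Sp[-A, +B] extends (A => B) { type Excluded }

  // Wraps a plain function and tags it with an exclusion E (unchecked here).
  def sp[A, B, E](f: A => B): Sp[A, B] { type Excluded = E } =
    new Sp[A, B] { type Excluded = E; def apply(a: A): B = f(a) }

  // Composition keeps both operands' exclusions: the result is their conjunction.
  def andThen[A, B, C, E1, E2](
      s1: Sp[A, B] { type Excluded = E1 },
      s2: Sp[B, C] { type Excluded = E2 }
  ): Sp[A, C] { type Excluded = E1 with E2 } =
    new Sp[A, C] { type Excluded = E1 with E2; def apply(a: A): C = s2(s1(a)) }

  val s1 = sp[Int, String, NoCapture[Actor]](i => "n=" + i)
  val s2 = sp[String, String, NoCapture[RDD[Int]]](s => s + "!")

  // This ascription type-checks only because both constraints survived andThen.
  val both: Sp[Int, String] { type Excluded = NoCapture[Actor] with NoCapture[RDD[Int]] } =
    andThen(s1, s2)

  def main(args: Array[String]): Unit =
    println(both(3)) // n=3!
}
```

The point of the sketch is only the monotonic accumulation: a consumer that requires NoCapture[Actor] can accept both, because composing never drops a constraint.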

Slide 82

Slide 82 text

What do type constraints buy us?
• Stronger constraints checked at compile time (not "just" basic spore rules)
• Frameworks can make stronger assumptions about spores created by users.
• Confidence in consuming, creating, and composing spores: constraints accumulate monotonically, and constraints are never lost when composing spores
• Less brittleness.

Slide 83

Slide 83 text

And now onto something completely different.

Slide 84

Slide 84 text

Pickles!

Slide 85

Slide 85 text

https://github.com/scala/pickling

Slide 86

Slide 86 text

PICKLING == SERIALIZATION == MARSHALLING

Slide 87

Slide 87 text

PICKLING == SERIALIZATION == MARSHALLING (very different from Java serialization)

Slide 88

Slide 88 text

Wait, why do we care?

Slide 89

Slide 89 text

Wait, why do we care?

Slide 90

Slide 90 text

Wait, why do we care?

Slide 91

Slide 91 text

Wait, why do we care? Not serializable exceptions at runtime.

Slide 92

Slide 92 text

Wait, why do we care? Not serializable exceptions at runtime. Can’t retroactively make classes serializable.

Slide 93

Slide 93 text

Enter: Scala Pickling
• FAST: Serialization code generated at compile-time and inlined at the use-site.
• FLEXIBLE: Using typeclass pattern, retroactively make types serializable.
• NO BOILERPLATE: Typeclass instances generated at compile-time.
• PLUGGABLE FORMATS: Effortlessly change format of serialized data: binary, JSON, invent your own!
• TYPESAFE: Picklers are type-specialized. Catch errors at compile-time!
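The "retroactively make types serializable" claim rests on the typeclass pattern, which can be sketched in a few lines of plain Scala. Everything below is hypothetical and hand-written for illustration (ToyPickler, LegacyPoint, the JSON-like output); the real library generates such instances at compile time and has a different interface.

```scala
object TypeclassPicklerDemo {
  // Hypothetical typeclass: evidence that values of T can be pickled.
  trait ToyPickler[T] { def pickle(value: T): String }

  object ToyPickler {
    implicit val intPickler: ToyPickler[Int] =
      new ToyPickler[Int] { def pickle(v: Int): String = v.toString }
  }

  // A class we pretend we cannot modify, e.g. one that lives in a third-party jar.
  final class LegacyPoint(val x: Int, val y: Int)

  // Retroactive support: the instance lives outside the class, which stays untouched.
  implicit val pointPickler: ToyPickler[LegacyPoint] =
    new ToyPickler[LegacyPoint] {
      private val ints = implicitly[ToyPickler[Int]]
      def pickle(p: LegacyPoint): String =
        "{\"x\": " + ints.pickle(p.x) + ", \"y\": " + ints.pickle(p.y) + "}"
    }

  // Compiles only when an instance for T is in scope, so a missing pickler
  // is a compile-time error rather than a runtime exception.
  def pickle[T](value: T)(implicit p: ToyPickler[T]): String = p.pickle(value)

  def demo: String = pickle(new LegacyPoint(1, 2))

  def main(args: Array[String]): Unit =
    println(demo) // {"x": 1, "y": 2}
}
```

Contrast this with java.io.Serializable, which has to be declared on the class itself and is only checked when writeObject runs.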

Slide 94

Slide 94 text

https://github.com/scala/pickling

Slide 95

Slide 95 text

https://github.com/scala/pickling W ?

Slide 96

Slide 96 text

    scala> import scala.pickling._
    import scala.pickling._

Slide 97

Slide 97 text

    scala> import scala.pickling._
    import scala.pickling._

    scala> import json._
    import json._

Slide 98

Slide 98 text

    scala> import scala.pickling._
    import scala.pickling._

    scala> import json._
    import json._

    scala> case class Person(name: String, age: Int)
    defined class Person

    scala> Person("John Oliver", 36)
    res0: Person = Person(John Oliver,36)

Slide 99

Slide 99 text

    scala> import scala.pickling._
    import scala.pickling._

    scala> import json._
    import json._

    scala> case class Person(name: String, age: Int)
    defined class Person

    scala> Person("John Oliver", 36)
    res0: Person = Person(John Oliver,36)

    scala> res0.pickle
    res1: scala.pickling.json.JSONPickle =
    JSONPickle({
      "tpe": "Person",
      "name": "John Oliver",
      "age": 36
    })

Slide 100

Slide 100 text

and... it’s pretty fast.

Slide 101

Slide 101 text

[Benchmark plot] collections: time

Slide 102

Slide 102 text

[Benchmark plot] collections: free memory (more is better)

Slide 103

Slide 103 text

[Benchmark plot] collections: size

Slide 104

Slide 104 text

[Benchmark plot] geotrellis: time

Slide 105

Slide 105 text

[Benchmark plot] evactor: time (Java runs out of memory)

Slide 106

Slide 106 text

[Benchmark plot] evactor: time (no Java, more events)

Slide 107

Slide 107 text

That’s just the default behavior...

Slide 108

Slide 108 text

That’s just the default behavior... you can really customize Scala pickling too.

Slide 109

Slide 109 text

Customizing pickling
Previous examples used default behavior:
• Generated picklers
• Standard pickle format
Pickling is very customizable:
• Custom picklers for specific types
• Custom pickle format

Slide 110

Slide 110 text

Customize what you pickle!

    case class Person(name: String, age: Int, salary: Int)

    class CustomPersonPickler(implicit val format: PickleFormat) extends SPickler[Person] {
      def pickle(picklee: Person, builder: PBuilder): Unit = {
        builder.beginEntry(picklee)
        builder.putField("name", b =>
          b.hintTag(FastTypeTag.ScalaString).beginEntry(picklee.name).endEntry())
        builder.putField("age", b =>
          b.hintTag(FastTypeTag.Int).beginEntry(picklee.age).endEntry())
        builder.endEntry()
      }
    }

    implicit def genCustomPersonPickler(implicit format: PickleFormat) = new CustomPersonPickler

Slide 111

Slide 111 text

Output any format!

    trait PickleFormat {
      type PickleType <: Pickle
      def createBuilder(): PBuilder
      def createReader(pickle: PickleType, mirror: Mirror): PReader
    }

    trait PBuilder extends Hintable {
      def beginEntry(picklee: Any): PBuilder
      def putField(name: String, pickler: PBuilder => Unit): PBuilder
      def endEntry(): Unit
      def beginCollection(length: Int): PBuilder
      def putElement(pickler: PBuilder => Unit): PBuilder
      def endCollection(length: Int): Unit
      def result(): Pickle
    }

Slide 112

Slide 112 text

Example: output edn, Clojure’s data transfer format (talk to a Clojure app).
Toy builder implementation: https://gist.github.com/heathermiller/5760171

    scala> import scala.pickling._
    import scala.pickling._

    scala> import edn._
    import edn._

    scala> case class Person(name: String, kidsAges: Array[Int])
    defined class Person

    scala> val joe = Person("Joe", Array(3, 4, 13))
    joe: Person = Person(Joe,[I@3d925789)

    scala> joe.pickle.value
    res0: String =
    #pickling/Person {
      :name "Joe"
      :kidsAges [3, 4, 13]
    }

Slide 113

Slide 113 text

Status
• Scala 2.11 as target
• 1.0 release within the next few months
• SIP for Scala 2.11
• Integration with sbt, Spark, and Akka, ...
• Release 0.8.0 for Scala 2.10.2
• No support for inner classes, yet
• ScalaCheck tests
• Support for cyclic object graphs, and most Scala types
https://github.com/scala/pickling

Slide 114

Slide 114 text

QUESTIONS? heather.miller@epfl.ch @heathercmiller