Slide 1

Slide 1 text

ScalaBlitz Efficient Collections Framework

Slide 2

Slide 2 text

What’s a Blitz?

Slide 3

Slide 3 text

Blitz-chess is a style of rapid chess play.

Slide 4

Slide 4 text

Blitz-chess is a style of rapid chess play.

Slide 5

Slide 5 text

Knights have horses.

Slide 6

Slide 6 text

Horses run fast.

Slide 7

Slide 7 text

def mean(xs: Array[Float]): Float = xs.par.reduce(_ + _) / xs.length

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

With Lists, operations can only be executed from left to right

Slide 11

Slide 11 text

1 2 4 8

Slide 12

Slide 12 text

1 2 4 8 Not your typical list.

Slide 13

Slide 13 text

Bon app.

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Apparently not enough

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

Apparently, no amount of documentation is enough

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

reduceLeft guarantees that operations are executed from left to right
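The ordering guarantee only matters when the operator is not associative. The following is a minimal sketch using the standard library alone; `treeReduce` is a hypothetical helper (not part of any library) that groups operands the way a parallel reduce might, while running sequentially.

```scala
object ReduceOrder {
  // Hypothetical helper: reduces by splitting the sequence in half, which is
  // one way a parallel reduce may group operands. Runs sequentially here.
  def treeReduce[T](xs: Vector[T])(op: (T, T) => T): T = {
    require(xs.nonEmpty, "cannot reduce an empty sequence")
    if (xs.length == 1) xs.head
    else {
      val (left, right) = xs.splitAt(xs.length / 2)
      op(treeReduce(left)(op), treeReduce(right)(op))
    }
  }

  val xs = Vector(1, 2, 4, 8)

  // Addition is associative: the grouping does not change the result.
  val sumLeft = xs.reduceLeft(_ + _)    // ((1 + 2) + 4) + 8
  val sumTree = treeReduce(xs)(_ + _)   // (1 + 2) + (4 + 8)

  // Subtraction is not associative: the grouping changes the result.
  val diffLeft = xs.reduceLeft(_ - _)   // ((1 - 2) - 4) - 8
  val diffTree = treeReduce(xs)(_ - _)  // (1 - 2) - (4 - 8)
}
```

With addition both groupings agree; with subtraction they do not, which is why an operation that promises left-to-right order cannot simply be split across workers.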

Slide 23

Slide 23 text

Parallel and sequential collections sharing operations

Slide 24

Slide 24 text

There are several problems here

Slide 25

Slide 25 text

How we see users

Slide 26

Slide 26 text

How users see the docs

Slide 27

Slide 27 text

Bending the truth.

Slide 28

Slide 28 text

And sometimes we were just slow

Slide 29

Slide 29 text

So, we have a new API now
def findDoe(names: Array[String]): Option[String] = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 30

Slide 30 text

Wait, you renamed a method?
def findDoe(names: Array[String]): Option[String] = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 31

Slide 31 text

Yeah, par already exists. But toPar is different.
def findDoe(names: Array[String]): Option[String] = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 32

Slide 32 text

def findDoe(names: Array[String]): Option[String] = {
  names.toPar.find(_.endsWith("Doe"))
}
implicit class ParOps[Repr](val r: Repr) extends AnyVal {
  def toPar = new Par(r)
}

Slide 33

Slide 33 text

def findDoe(names: Array[String]): Option[String] = {
  ParOps(names).toPar.find(_.endsWith("Doe"))
}
implicit class ParOps[Repr](val r: Repr) extends AnyVal {
  def toPar = new Par(r)
}

Slide 34

Slide 34 text

def findDoe(names: Array[String]): Option[String] = {
  ParOps(names).toPar.find(_.endsWith("Doe"))
}
implicit class ParOps[Repr](val r: Repr) extends AnyVal {
  def toPar = new Par(r)
}
class Par[Repr](r: Repr)

Slide 35

Slide 35 text

def findDoe(names: Array[String]): Option[String] = {
  (new Par(names)).find(_.endsWith("Doe"))
}
implicit class ParOps[Repr](val r: Repr) extends AnyVal {
  def toPar = new Par(r)
}
class Par[Repr](r: Repr)

Slide 36

Slide 36 text

def findDoe(names: Array[String]): Option[String] = {
  (new Par(names)).find(_.endsWith("Doe"))
}
class Par[Repr](r: Repr)
But Par[Repr] does not have a find method!

Slide 37

Slide 37 text

True, but Par[Array[String]] does have a find method.
def findDoe(names: Array[String]): Option[String] = {
  (new Par(names)).find(_.endsWith("Doe"))
}
class Par[Repr](r: Repr)

Slide 38

Slide 38 text

def findDoe(names: Array[String]): Option[String] = {
  (new Par(names)).find(_.endsWith("Doe"))
}
class Par[Repr](r: Repr)
implicit class ParArrayOps[T](pa: Par[Array[T]]) {
  ...
  def find(p: T => Boolean): Option[T]
  ...
}
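The pieces from the last few slides can be put together into one compilable sketch. This is an illustration of the pattern, not the library's source: `Par` exposes its wrapped value as a `val` so the extension can reach it, the value-class modifier is dropped for brevity, and the body of `find` is a sequential stand-in for a real parallel search.

```scala
object ParSketch {
  // Thin wrapper that only marks a collection as "to be processed in parallel".
  // It deliberately defines no collection operations of its own.
  class Par[Repr](val r: Repr)

  // Adds toPar to any value.
  implicit class ParOps[Repr](val r: Repr) {
    def toPar: Par[Repr] = new Par(r)
  }

  // Operations are attached per representation type. This body is a sequential
  // stand-in (assumption); a real implementation would split the array among
  // worker threads.
  implicit class ParArrayOps[T](val pa: Par[Array[T]]) {
    def find(p: T => Boolean): Option[T] = pa.r.find(p)
  }

  // Resolves as ParArrayOps(ParOps(names).toPar).find(...)
  def findDoe(names: Array[String]): Option[String] =
    names.toPar.find(_.endsWith("Doe"))
}
```

Because `find` lives in `ParArrayOps` rather than in `Par`, a representation type with no sensible parallel `find` simply gets no such method, and the call fails at compile time.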

Slide 39

Slide 39 text

More flexible!

Slide 40

Slide 40 text

More flexible!
● does not have to implement methods that make no sense in parallel

Slide 41

Slide 41 text

More flexible!
● does not have to implement methods that make no sense in parallel
● slow conversions explicit

Slide 42

Slide 42 text

No standard library collections were hurt doing this.

Slide 43

Slide 43 text

More flexible!
● does not have to implement methods that make no sense in parallel
● slow conversions explicit
● non-intrusive addition to standard library

Slide 44

Slide 44 text

More flexible!
● does not have to implement methods that make no sense in parallel
● slow conversions explicit
● non-intrusive addition to standard library
● easy to add new methods and collections

Slide 45

Slide 45 text

More flexible!
● does not have to implement methods that make no sense in parallel
● slow conversions explicit
● non-intrusive addition to standard library
● easy to add new methods and collections
● import switches between implementations

Slide 46

Slide 46 text

def findDoe(names: Seq[String]): Option[String] = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 47

Slide 47 text

def findDoe(names: Seq[String]): Option[String] = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 48

Slide 48 text

But how do I write generic code?
def findDoe(names: Seq[String]): Option[String] = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 49

Slide 49 text

def findDoe[Repr[_]](names: Par[Repr[String]]) = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 50

Slide 50 text

Par[Repr[String]] does not have a find method
def findDoe[Repr[_]](names: Par[Repr[String]]) = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 51

Slide 51 text

def findDoe[Repr[_]: Ops](names: Par[Repr[String]]) = {
  names.toPar.find(_.endsWith("Doe"))
}

Slide 52

Slide 52 text

def findDoe[Repr[_]: Ops](names: Par[Repr[String]]) = {
  names.toPar.find(_.endsWith("Doe"))
}
We don’t do this.

Slide 53

Slide 53 text

Make everything as simple as possible, but not simpler.

Slide 54

Slide 54 text

def findDoe(names: Reducable[String]) = {
  names.find(_.endsWith("Doe"))
}

Slide 55

Slide 55 text

def findDoe(names: Reducable[String]) = {
  names.find(_.endsWith("Doe"))
}
findDoe(Array("Jane Roe", "John Doe").toPar)

Slide 56

Slide 56 text

def findDoe(names: Reducable[String]) = {
  names.find(_.endsWith("Doe"))
}
findDoe(toReducable(Array("Jane Roe", "John Doe").toPar))

Slide 57

Slide 57 text

def findDoe(names: Reducable[String]) = {
  names.find(_.endsWith("Doe"))
}
findDoe(toReducable(Array("Jane Roe", "John Doe").toPar))
def arrayIsReducable[T]: IsReducable[T] = { … }
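One way such a conversion could be wired up is sketched below. Everything here is an assumption made for illustration: the shapes of `Reducable` and `IsReducable` (given two type parameters here, unlike the one-parameter signature on the slide), the `val` on `Par`, and the sequential body of `find` are not taken from the library.

```scala
import scala.language.implicitConversions

object ReducableSketch {
  class Par[Repr](val r: Repr)
  implicit class ParOps[Repr](val r: Repr) {
    def toPar: Par[Repr] = new Par(r)
  }

  // Generic interface that client code programs against (hypothetical shape).
  trait Reducable[T] {
    def find(p: T => Boolean): Option[T]
  }

  // Evidence that a Par[Repr] can be viewed as a Reducable[T] (hypothetical shape).
  trait IsReducable[Repr, T] {
    def apply(pr: Par[Repr]): Reducable[T]
  }

  // Evidence for arrays; find is a sequential stand-in.
  implicit def arrayIsReducable[T]: IsReducable[Array[T], T] =
    new IsReducable[Array[T], T] {
      def apply(pr: Par[Array[T]]): Reducable[T] = new Reducable[T] {
        def find(p: T => Boolean): Option[T] = pr.r.find(p)
      }
    }

  // Conversion driven by the evidence above.
  implicit def toReducable[Repr, T](pr: Par[Repr])(
      implicit ev: IsReducable[Repr, T]): Reducable[T] = ev(pr)

  // Generic client code: knows nothing about arrays.
  def findDoe(names: Reducable[String]): Option[String] =
    names.find(_.endsWith("Doe"))

  // Conversion written out explicitly, as on the slide.
  val result: Option[String] =
    findDoe(toReducable(Array("Jane Roe", "John Doe").toPar))
}
```

Supporting another collection type then means supplying one more evidence value, while `findDoe` stays unchanged.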

Slide 58

Slide 58 text

So let’s write a program!

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

import scala.collection.par._
val pixels = new Array[Int](wdt * hgt)
for (idx <- (0 until (wdt * hgt)).toPar) {
}

Slide 61

Slide 61 text

import scala.collection.par._
val pixels = new Array[Int](wdt * hgt)
for (idx <- (0 until (wdt * hgt)).toPar) {
  val x = idx % wdt
  val y = idx / wdt
}

Slide 62

Slide 62 text

import scala.collection.par._
val pixels = new Array[Int](wdt * hgt)
for (idx <- (0 until (wdt * hgt)).toPar) {
  val x = idx % wdt
  val y = idx / wdt
  pixels(idx) = computeColor(x, y)
}

Slide 63

Slide 63 text

import scala.collection.par._
val pixels = new Array[Int](wdt * hgt)
for (idx <- (0 until (wdt * hgt)).toPar) {
  val x = idx % wdt
  val y = idx / wdt
  pixels(idx) = computeColor(x, y)
}
Scheduler not found!

Slide 64

Slide 64 text

import scala.collection.par._
import Scheduler.Implicits.global
val pixels = new Array[Int](wdt * hgt)
for (idx <- (0 until (wdt * hgt)).toPar) {
  val x = idx % wdt
  val y = idx / wdt
  pixels(idx) = computeColor(x, y)
}

Slide 65

Slide 65 text

import scala.collection.par._
import Scheduler.Implicits.global
val pixels = new Array[Int](wdt * hgt)
for (idx <- (0 until (wdt * hgt)).toPar) {
  val x = idx % wdt
  val y = idx / wdt
  pixels(idx) = computeColor(x, y)
}

Slide 66

Slide 66 text

New parallel collections 33% faster!
Now 103 ms
Previously 148 ms

Slide 67

Slide 67 text

Workstealing tree scheduler rocks!

Slide 68

Slide 68 text

Workstealing tree scheduler rocks! But, are there other interesting

Slide 69

Slide 69 text

Fine-grained uniform workloads are on the opposite side of the spectrum.

Slide 70

Slide 70 text

def mean(xs: Array[Float]): Float = {
  val sum = xs.toPar.fold(0f)(_ + _)
  sum / xs.length
}

Slide 71

Slide 71 text

def mean(xs: Array[Float]): Float = {
  val sum = xs.toPar.fold(0f)(_ + _)
  sum / xs.length
}
Now 15 ms
Previously 565 ms

Slide 72

Slide 72 text

But how?

Slide 73

Slide 73 text

def fold[T](a: Iterable[T])(z: T)(op: (T, T) => T) = {
  val it = a.iterator
  var acc = z
  while (it.hasNext) {
    acc = op(acc, it.next())
  }
  acc
}

Slide 74

Slide 74 text

def fold[T](a: Iterable[T])(z: T)(op: (T, T) => T) = {
  val it = a.iterator
  var acc = z
  while (it.hasNext) {
    acc = box(op(acc, it.next()))
  }
  acc
}

Slide 75

Slide 75 text

def fold[T](a: Iterable[T])(z: T)(op: (T, T) => T) = {
  val it = a.iterator
  var acc = z
  while (it.hasNext) {
    acc = box(op(acc, it.next()))
  }
  acc
}
Generic methods cause boxing of primitives
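The boxing can be observed directly. The sketch below instruments a generic fold so that it also reports the runtime class of its accumulator; `foldAndInspect` is a name made up for this illustration.

```scala
object BoxingSketch {
  // Generic fold, instrumented to record the runtime class of the accumulator.
  // After erasure T is Object, so inside this method a Float is carried as a
  // java.lang.Float instance.
  def foldAndInspect[T](a: Iterable[T])(z: T)(op: (T, T) => T): (T, Class[_]) = {
    val it = a.iterator
    var acc = z
    while (it.hasNext) {
      acc = op(acc, it.next())
    }
    (acc, acc.asInstanceOf[AnyRef].getClass)
  }

  // T is inferred as Float, yet the accumulator is a boxed object at runtime.
  val (sum, runtimeClass) =
    foldAndInspect(List(1.0f, 2.0f, 3.5f))(0.0f)(_ + _)
}
```

Here `runtimeClass` is `java.lang.Float` rather than the primitive `float`, which is the cost the slide refers to.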

Slide 76

Slide 76 text

def mean(xs: Array[Float]): Float = {
  val sum = xs.toPar.fold(0f)(_ + _)
  sum / xs.length
}

Slide 77

Slide 77 text

def mean(xs: Array[Float]): Float = {
  val sum = xs.toPar.fold(0f)(_ + _)
  sum / xs.length
}
Generic methods hurt performance
What can we do instead?

Slide 78

Slide 78 text

def mean(xs: Array[Float]): Float = {
  val sum = xs.toPar.fold(0f)(_ + _)
  sum / xs.length
}
Generic methods hurt performance
What can we do instead?
Inline method body!

Slide 79

Slide 79 text

def mean(xs: Array[Float]): Float = {
  val sum = {
    val it = xs.iterator
    var acc = 0f
    while (it.hasNext) {
      acc = acc + it.next()
    }
    acc
  }
  sum / xs.length
}

Slide 80

Slide 80 text

def mean(xs: Array[Float]): Float = {
  val sum = {
    val it = xs.iterator
    var acc = 0f
    while (it.hasNext) {
      acc = acc + it.next()
    }
    acc
  }
  sum / xs.length
}
Specific type
No boxing!
No memory allocation!

Slide 81

Slide 81 text

def mean(xs: Array[Float]): Float = {
  val sum = {
    val it = xs.iterator
    var acc = 0f
    while (it.hasNext) {
      acc = acc + it.next()
    }
    acc
  }
  sum / xs.length
}
Specific type
No boxing!
No memory allocation!
2x speedup: 565 ms → 281 ms

Slide 82

Slide 82 text

def mean(xs: Array[Float]): Float = {
  val sum = {
    val it = xs.iterator
    var acc = 0f
    while (it.hasNext) {
      acc = acc + it.next()
    }
    acc
  }
  sum / xs.length
}

Slide 83

Slide 83 text

def mean(xs: Array[Float]): Float = {
  val sum = {
    val it = xs.iterator
    var acc = 0f
    while (it.hasNext) {
      acc = acc + it.next()
    }
    acc
  }
  sum / xs.length
}
Iterators? For Array? We don’t need them!

Slide 84

Slide 84 text

def mean(xs: Array[Float]): Float = {
  val sum = {
    var i = 0
    val until = xs.size
    var acc = 0f
    while (i < until) {
      acc = acc + xs(i)
      i = i + 1
    }
    acc
  }
  sum / xs.length
}
Use index-based access!

Slide 85

Slide 85 text

def mean(xs: Array[Float]): Float = {
  val sum = {
    var i = 0
    val until = xs.size
    var acc = 0f
    while (i < until) {
      acc = acc + xs(i)
      i = i + 1
    }
    acc
  }
  sum / xs.length
}
Use index-based access!
19x speedup: 281 ms → 15 ms
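For reference, the three stages of this rewrite are collected below in a sequential, standard-library-only form so that they can be compared side by side. The method names are made up for this sketch, and no timings are implied: measuring the difference would need a proper benchmark harness.

```scala
object MeanVariants {
  // Stage 1: generic fold from the standard library.
  def meanFold(xs: Array[Float]): Float =
    xs.fold(0f)(_ + _) / xs.length

  // Stage 2: fold body inlined at a specific type, still using an iterator.
  def meanIterator(xs: Array[Float]): Float = {
    val it = xs.iterator
    var acc = 0f
    while (it.hasNext) {
      acc = acc + it.next()
    }
    acc / xs.length
  }

  // Stage 3: iterator replaced by index-based access into the array.
  def meanIndexed(xs: Array[Float]): Float = {
    var i = 0
    val until = xs.length
    var acc = 0f
    while (i < until) {
      acc = acc + xs(i)
      i = i + 1
    }
    acc / xs.length
  }
}
```

All three add the elements in the same left-to-right order, so they return the same value for the same input; only the amount of work per element differs.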

Slide 86

Slide 86 text

Are those optimizations parallel-collections specific?

Slide 87

Slide 87 text

Are those optimizations parallel-collections specific?
No

Slide 88

Slide 88 text

Are those optimizations parallel-collections specific?
No
You can use them on sequential collections

Slide 89

Slide 89 text

def mean(xs: Array[Float]): Float = {
  val sum = xs.fold(0f)(_ + _)
  sum / xs.length
}

Slide 90

Slide 90 text

import scala.collection.optimizer._
def mean(xs: Array[Float]): Float = optimize {
  val sum = xs.fold(0f)(_ + _)
  sum / xs.length
}

Slide 91

Slide 91 text

import scala.collection.optimizer._
def mean(xs: Array[Float]): Float = optimize {
  val sum = xs.fold(0f)(_ + _)
  sum / xs.length
}
You get a 38x speedup!

Slide 92

Slide 92 text

Future work

Slide 93

Slide 93 text

@specialized collections
● Maps
● Sets
● Lists
● Vectors
Both faster & consuming less memory

Slide 94

Slide 94 text

@specialized collections
● Maps
● Sets
● Lists
● Vectors
Both faster & consuming less memory
Expect to get this for free inside an optimize{} block

Slide 95

Slide 95 text

JDK 8-style streams (parallel views)
● Fast
● Lightweight
● Expressive API
● Optimized
Lazy data-parallel operations made easy

Slide 96

Slide 96 text

Futures-based asynchronous API
val sum = future{ xs.sum }
val normalized = sum.andThen(sum => sum/xs.size)
Boilerplate code, ugly
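For comparison, here is how the same two steps look with the standard library's `scala.concurrent.Future` as it exists today. This is a sketch of the existing style, not of the proposed API; it uses `map` to transform the eventual sum, since the standard `andThen` only runs a side effect and leaves the result unchanged. `meanAsync` is a name chosen for this example.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FutureSketch {
  // Computes the sum on the global execution context, then derives the mean
  // from it once the sum is available.
  def meanAsync(xs: Array[Float]): Future[Float] = {
    val sum: Future[Float] = Future { xs.sum }
    sum.map(s => s / xs.length)
  }
}
```

A caller would typically chain further work with `map` or `flatMap`, or block with `Await.result` in tests and scripts only.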

Slide 97

Slide 97 text

Futures-based asynchronous API
val sum = xs.toFuture.sum
val scaled = xs.map(_ / sum)
● Simple to use
● Lightweight
● Expressive API
● Optimized
Asynchronous data-parallel operations made easy

Slide 98

Slide 98 text

Current research: operation fusion
val minMaleAge = people.filter(_.isMale)
  .map(_.age).min
val minFemaleAge = people.filter(_.isFemale)
  .map(_.age).min

Slide 99

Slide 99 text

Current research: operation fusion
val minMaleAge = people.filter(_.isMale)
  .map(_.age).min
val minFemaleAge = people.filter(_.isFemale)
  .map(_.age).min
● Requires up to 3 times more memory than the original collection
● Requires 6 traversals of collections

Slide 100

Slide 100 text

Current research: operation fusion
val minMaleAge = people.filter(_.isMale)
  .map(_.age).min
val minFemaleAge = people.filter(_.isFemale)
  .map(_.age).min
● Requires up to 3 times more memory than the original collection
● Requires 6 traversals of collections
We aim to reduce this to a single traversal with no additional memory, without you changing your code.
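To make the target concrete, the sketch below shows what a fused version looks like when written by hand. It only illustrates the intended result, not how the optimizer would derive it; the `Person` class is a hypothetical stand-in for the element type, and both versions assume each group has at least one member.

```scala
object FusionSketch {
  // Hypothetical element type for this example.
  final case class Person(age: Int, isMale: Boolean) {
    def isFemale: Boolean = !isMale
  }

  // Unfused: each filter and map builds an intermediate collection, and each
  // step traverses its input again.
  def unfused(people: Vector[Person]): (Int, Int) = {
    val minMaleAge = people.filter(_.isMale).map(_.age).min
    val minFemaleAge = people.filter(_.isFemale).map(_.age).min
    (minMaleAge, minFemaleAge)
  }

  // Hand-fused: one traversal, two running minima, no intermediate collections.
  def fused(people: Vector[Person]): (Int, Int) = {
    var minMaleAge = Int.MaxValue
    var minFemaleAge = Int.MaxValue
    people.foreach { p =>
      if (p.isMale) {
        if (p.age < minMaleAge) minMaleAge = p.age
      } else if (p.age < minFemaleAge) {
        minFemaleAge = p.age
      }
    }
    (minMaleAge, minFemaleAge)
  }
}
```

The fused form is what one would write for speed and the unfused form is what one would write for readability; the stated aim is to get the former from the latter automatically.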