FS2: Internals

Presented at Scale By The Bay on November 17, 2017

Michael Pilquist

November 18, 2017

Transcript

  1. 2.

    Functional Streams for Scala 0.10

        libraryDependencies += "co.fs2" %% "fs2-core" % "0.10.0-M8"

    • Simplify
    • Improve Performance
    • Support Ecosystem
  2. 4.

    API Example: Exponential Backoff

        val poll: IO[Message] = IO(…)

        val latestMessage: Stream[IO, Option[Message]] = ???
        // Poll once an hour; if poll fails, retry with an exponential delay
        // Stream should emit last successfully polled message whenever pulled
  3. 5.

    API Example: Exponential Backoff

        val poll: IO[Message] = IO(…)

        val latestMessage: Stream[IO, Option[Message]] =
          Scheduler[IO](2).flatMap { scheduler =>
            ???
          }
  4. 6.

    API Example: Exponential Backoff

        val poll: IO[Message] = IO(…)

        val latestMessage: Stream[IO, Option[Message]] =
          Scheduler[IO](2).flatMap { scheduler =>
            val retryPoll: Stream[IO, Message] =
              scheduler.retry(poll, 1.second, _ * 2, maxRetries = Int.MaxValue)
            ???
          }
  5. 7.

    API Example: Exponential Backoff

        val poll: IO[Message] = IO(…)

        val latestMessage: Stream[IO, Option[Message]] =
          Scheduler[IO](2).flatMap { scheduler =>
            val retryPoll: Stream[IO, Message] =
              scheduler.retry(poll, 1.second, _ * 2, maxRetries = Int.MaxValue)
            val repeatPoll: Stream[IO, Message] =
              (retryPoll ++ scheduler.sleep_[IO](1.hour)).repeat
            ???
          }
  6. 8.

    API Example: Exponential Backoff

        val poll: IO[Message] = IO(…)

        val latestMessage: Stream[IO, Option[Message]] =
          Scheduler[IO](2).flatMap { scheduler =>
            val retryPoll: Stream[IO, Message] =
              scheduler.retry(poll, 1.second, _ * 2, maxRetries = Int.MaxValue)
            val repeatPoll: Stream[IO, Message] =
              (retryPoll ++ scheduler.sleep_[IO](1.hour)).repeat
            async.holdOption(repeatPoll).flatMap(_.continuous)
          }
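The retry call in the example takes an initial delay of `1.second` and a step function `_ * 2`. As a rough illustration of the delay schedule those two arguments imply, here is a minimal sketch in plain Scala that does not use FS2 at all. The names `BackoffSchedule` and `delays` are hypothetical and exist only for this illustration; how `scheduler.retry` computes its delays internally is not shown in the slides.

```scala
import scala.concurrent.duration._

object BackoffSchedule {
  // Hypothetical helper: lists the first `count` delays produced by starting at
  // `initial` and applying `next` after every failed attempt.
  def delays(
      initial: FiniteDuration,
      next: FiniteDuration => FiniteDuration,
      count: Int
  ): List[FiniteDuration] =
    Iterator.iterate(initial)(next).take(count).toList
}
```

For example, `BackoffSchedule.delays(1.second, _ * 2, 4)` yields delays of 1, 2, 4, and 8 seconds. Note that repeated doubling eventually exceeds the range of `FiniteDuration`, so a long-running schedule would likely want a cap on the delay.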
  7. 9.

    Task vs IO

    • Deleted fs2.Task
    • Replaced by cats.effect.IO
    • FS2 polymorphic in effect though!
  8. 10.

    0.9 Type Classes

    [Diagram: Functor, Applicative, Monad, Traverse, Catchable, Suspendable, Effect, Async]

    • fs2.util defined major FP type classes
    • Bidirectional shims to Cats and Scalaz
    • Shims in separate libraries
    • Minimal
    • no Eq, Semigroup, Monoid
    • Error prone and confusing
  9. 11.

    0.10 Type Classes

    [Diagram: Functor, Applicative, Monad, Traverse, MonadError, LiftIO, Sync, Async, Effect]

    • Core type classes from cats-core
    • Effect type classes from cats-effect
    • No type classes from fs2
    • Use of Eq, Semigroup, Monoid, Show, ~>, Eval
    • fs2.async built on Effect and ExecutionContext
    • Lock-free concurrent data structures with referential transparency
    • MVar-like Ref[F,A]
    • Strategy is gone
  10. 12.

    [Figure: Cats type class infographic, showing the cats-core hierarchy (Functor, Applicative, Monad, MonadError, Traverse, and related classes) together with the cats-effect classes Sync, Async, Effect, and LiftIO.]

    Cats Infographic by @tpolecat, https://github.com/tpolecat/cats-infographic/blob/master/cats.pdf, CC-BY-SA 4.0
  11. 13.

    Variance Tricks

    0.9:

        src.evalMap[Task, Task, Foo](…)

        def evalMap[G[_],Lub[_],O2](f: O => G[O2])(implicit L: Lub1[F,G,Lub]): Stream[Lub,O2] =

    0.10:

        def evalMap[O2](f: O => F[O2]): Stream[F,O2] =

    (this is mostly true)
  12. 14.

    Less meaningless choice, more Scala-like

    0.9:

        src.through(pipe.filter(isEven))
        src.filter(isEven)
        src.through(pipe.unNoneTerminate)
        concurrent.join(4)(streams)

    0.10:

        src.filter(isEven)
        src.unNoneTerminate
        streams.join(4)
  13. 15.

    Segments

        abstract class Segment[+O,+R] {
          def unconsChunk: Either[R,(Chunk[O],Segment[O,R])]
        }

        abstract class Chunk[+O] extends Segment[O,Unit] {
          def size: Int
          def apply(i: Int): O
        }

    Potentially infinite, pure sequence of values of type O and a result of type R
  14. 16.

    Why Segments?

    Key insight: much of the work in a Stream happens as successive transformations of pure data

    [Diagram: Open File, Read, map, map, filter, flatMap, Read, filter, Close]

    • Need operator fusion to avoid intermediate chunks
    • Segment fuses all operations via staging
    • Constructors & operations are implemented via anonymous subtypes of Segment which implement the stage0 method
    • Evaluators (e.g., run, toList, splitAt, unconsChunk) first stage the segment and then step through the staged machine until done
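To make the "avoid intermediate chunks" point concrete, here is a small analogy in plain Scala. It is not how Segment is implemented (the slides describe staging via a stage0 method, which is not reproduced here); it only contrasts a pipeline that materializes a collection after every operation with one that composes the operations lazily and materializes once. The object and method names are made up for this sketch.

```scala
object FusionSketch {
  // Unfused: every operation allocates a new intermediate Vector,
  // much like producing an intermediate chunk per stream operation.
  def unfused(chunk: Vector[Int]): Vector[Int] =
    chunk.map(_ + 1).map(_ * 2).filter(_ % 4 == 0)

  // Fused (by analogy): the iterator stages are composed lazily and each
  // element flows through all of them before the single final Vector is built.
  def fused(chunk: Vector[Int]): Vector[Int] =
    chunk.iterator.map(_ + 1).map(_ * 2).filter(_ % 4 == 0).toVector
}
```

Both versions compute the same result; the difference is only in how many intermediate collections are allocated along the way.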
  15. 17.
  16. 18.

    Streams & Pulls

    Unified under a single algebra

        class Stream[+F[_],+O](
          val free: FreeC[Algebra[F,O,?],Unit]
        )

        class Pull[+F[_],+O,+R](
          val free: FreeC[Algebra[F,O,?],R]
        )
  17. 19.

    Free with a catch

        sealed abstract class FreeC[F[_], +R] {
          def flatMap[R2](f: R => FreeC[F, R2]): FreeC[F, R2]
          def onError[R2>:R](h: Throwable => FreeC[F,R2]): FreeC[F,R2]
        }
        case class Pure[F[_], R](r: R) extends FreeC[F, R]
        case class Eval[F[_], R](fr: F[R]) extends FreeC[F, R]
        case class Bind[F[_], X, R](fx: FreeC[F, X], f: Either[Throwable,X] => FreeC[F, R]) extends FreeC[F, R]
        case class Fail[F[_], R](error: Throwable) extends FreeC[F,R]

    • Free monad with built-in exception handling
    • Supports explicit failures (via Fail) and exceptions thrown from pure functions (e.g., from flatMap’s f or onError’s h)
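The following toy model may help show what "built-in exception handling" means in practice. It is a heavily simplified sketch, not the FS2 implementation: the effect type F is replaced by a plain thunk (`() => R`), the interpreter `run` is an invented helper, and it is not stack safe. It only demonstrates that a Bind continuation receives an `Either[Throwable, X]`, so both explicit failures and exceptions thrown by the functions passed to flatMap or onError end up as `Fail` values.

```scala
import scala.util.control.NonFatal

object FreeCSketch {
  sealed trait FreeC[+R] {
    // Exceptions thrown by `f` are captured as Fail instead of escaping.
    def flatMap[R2](f: R => FreeC[R2]): FreeC[R2] =
      Bind[R, R2](this, {
        case Right(r) => try f(r) catch { case NonFatal(e) => Fail(e) }
        case Left(e)  => Fail(e)
      })

    // Exceptions thrown by the handler `h` are captured as well.
    def onError[R2 >: R](h: Throwable => FreeC[R2]): FreeC[R2] =
      Bind[R, R2](this, {
        case Right(r) => Pure(r)
        case Left(e)  => try h(e) catch { case NonFatal(e2) => Fail(e2) }
      })
  }
  final case class Pure[R](r: R) extends FreeC[R]
  // Assumption for this sketch: an "effect" is just a thunk.
  final case class Eval[R](thunk: () => R) extends FreeC[R]
  final case class Bind[X, R](fx: FreeC[X], f: Either[Throwable, X] => FreeC[R]) extends FreeC[R]
  final case class Fail(error: Throwable) extends FreeC[Nothing]

  // Naive recursive interpreter (hypothetical, not stack safe).
  def run[R](fc: FreeC[R]): Either[Throwable, R] = fc match {
    case Pure(r)     => Right(r)
    case Eval(thunk) => try Right(thunk()) catch { case NonFatal(e) => Left(e) }
    case Fail(e)     => Left(e)
    case Bind(fx, f) => run(f(run(fx)))
  }
}
```

With this model, `run(Pure(1).flatMap[Int](_ => throw new RuntimeException("boom")))` returns a `Left` holding the exception rather than throwing, and appending `.onError(_ => Pure(0))` recovers to `Right(0)`.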
  18. 20.

    Core Algebra

        sealed trait Algebra[F[_],O,R]
        case class Output[F[_],O](s: Segment[O,Unit]) extends Algebra[F,O,Unit]
        case class Run[F[_],O,R](s: Segment[O,R]) extends Algebra[F,O,R]
        case class Eval[F[_],O,R](fr: F[R]) extends Algebra[F,O,R]
        case class Acquire[F[_],O,R](resource: F[R], release: R => F[Unit]) extends Algebra[F,O,(R,Token)]
        case class Release[F[_],O](token: Token) extends Algebra[F,O,Unit]
        case class OpenScope[F[_],O]() extends Algebra[F,O,Scope[F]]
        case class CloseScope[F[_],O](s: Scope[F]) extends Algebra[F,O,Unit]
        case class UnconsAsync[F[_],X,Y,O](s: FreeC[Algebra[F,O,?],Unit], ec: ExecutionContext) extends Algebra[F,X,AsyncPull[…]]
  19. 21.

    Four Takes

        def take1[F[_],O](n: Long): Pipe[F,O,O] = {
          def loop(s: Stream[F,O], n: Long): Pull[F,O,Unit] = {
            if (n <= 0) Pull.done
            else s.pull.uncons1.flatMap {
              case Some((hd,tl)) => Pull.output1(hd) >> loop(tl, n - 1)
              case None => Pull.done
            }
          }
          in => loop(in,n).stream
        }

    Take 1: Recurse on each element of the Stream
  20. 22.

    Four Takes

        def take2[F[_],O](n: Long): Pipe[F,O,O] = {
          def loop(s: Stream[F,O], n: Long): Pull[F,O,Unit] = {
            if (n <= 0) Pull.done
            else s.pull.unconsChunk.flatMap {
              case Some((hd,tl)) =>
                if (hd.size < n) Pull.output(hd) >> loop(tl, n - hd.size)
                else Pull.output(hd.strict.take(n))
              case None => Pull.done
            }
          }
          in => loop(in,n).stream
        }

    Take 2: Recurse on each Chunk of the Stream
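The chunk-wise recursion can be tried out without FS2 by modelling a stream as a plain list of chunks. The sketch below is only a model under that assumption: `List[Vector[O]]` stands in for a chunked stream, and `ChunkTakeSketch.takeChunks` is a hypothetical name. It follows the same branching as the chunk-based take: emit whole chunks while they are smaller than the remaining count, then emit a truncated final chunk and stop.

```scala
object ChunkTakeSketch {
  // Takes the first n elements from a "stream" modelled as a list of chunks,
  // preserving chunk boundaries for every chunk that is emitted whole.
  def takeChunks[O](chunks: List[Vector[O]], n: Long): List[Vector[O]] = {
    @annotation.tailrec
    def loop(rest: List[Vector[O]], remaining: Long, acc: List[Vector[O]]): List[Vector[O]] =
      if (remaining <= 0) acc.reverse
      else rest match {
        case hd :: tl =>
          // Whole chunk fits strictly below the remaining count: emit it and continue.
          if (hd.size < remaining) loop(tl, remaining - hd.size, hd :: acc)
          // Otherwise emit only the needed prefix and finish.
          // remaining <= hd.size here, so the toInt conversion is safe.
          else (hd.take(remaining.toInt) :: acc).reverse
        case Nil => acc.reverse
      }
    loop(chunks, n, Nil)
  }
}
```

One step of this loop handles a whole chunk, so the number of recursive steps is proportional to the number of chunks rather than the number of elements.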
  21. 23.

    Four Takes

        def take3[F[_],O](n: Long): Pipe[F,O,O] = in =>
          in.scanSegmentsOpt(n) { n =>
            if (n <= 0) None
            else Some(seg => seg.take(n).mapResult {
              case Left((_,n)) => n
              case Right(_) => 0
            })
          }

    Take 3: Recurse on each Segment of the Stream
  22. 24.

    Four Takes

    | ops/sec                 | Element Take           | Chunk Take                 | Segment Take              | 0.9 Take                  |
    |-------------------------|------------------------|----------------------------|---------------------------|---------------------------|
    | Array                   | 16,147.299 ± 321.075   | 896,624.227 ± 60,502.001   | 51,710.996 ± 1,847.084    | 220,837.799 ± 3,926.547   |
    | Segment unfoldChunk 4   | 8,908.767 ± 213.632    | 50,088.708 ± 1,773.925     | 134,921.777 ± 19,311.849  | N/A                       |
    | Segment unfoldChunk 32  | 8,451.779 ± 200.464    | 261,149.659 ± 12,810.773   | 330,209.773 ± 21,180.961  | N/A                       |
    | Segment unfoldChunk 256 | 8,163.377 ± 255.218    | 529,161.238 ± 17,859.538   | 208,547.745 ± 46,435.262  | N/A                       |
    | Stream unfold 1         | 13,073.909 ± 437.835   | 13,145.540 ± 1,434.234     | 49,100.455 ± 2,594.901    | 1,444.025 ± 59.796        |
    | Stream unfoldChunk 32   | 13,198.653 ± 1,925.924 | 210,085.262 ± 4,625.668    | 162,318.774 ± 4,696.607   | 39,607.407 ± 1,816.945    |
    | Stream unfoldChunk 256  | 15,736.231 ± 600.577   | 545,647.855 ± 106,543.254  | 223,550.186 ± 3,488.285   | 135,472.721 ± 2,972.956   |

    Benchmark: jmh:run -i 5 -wi 5 -f 2 -t 4 bench.Take, measuring take(300) on a 4GHz Intel i7 quad core
  23. 26.

    FS2 Internals

    • FS2 0.10 is significantly faster than 0.9
    • Write pipes with uncons and unconsChunk
    • Generate segmented or chunky streams
    • Buffer unitary streams
    • FS2 Migration Guide. https://github.com/functional-streams-for-scala/fs2/blob/series/0.10/docs/migration-guide-0.10.md
    • Stream Fusion, to Completeness. Oleg Kiselyov, Aggelos Biboudis, Nick Palladinos, Yannis Smaragdakis. https://arxiv.org/pdf/1612.06668v1.pdf
    • Special thanks to Paul Chiusano
    • Contact me @mpilquist