A sky full of streams

Jakub Kozłowski
September 03, 2019

Stream processing may sound intimidating, and often unnecessary. There's a reason: many of us rarely (or never) feel an actual need for it, because our data is never too large to fit in memory. But is that the only reason to use stream processing?

As it turns out, most of us stream data on a regular basis, just without being fully aware of it. What is more, having realized that, we can benefit from streaming in all kinds of situations, by decomposing a larger problem into smaller pieces we can reuse and reason about independently.

In this talk, I'll briefly introduce fs2, the functional streaming library for Scala, and its many use cases. You'll see what problems you can solve with it, as well as rough outlines of potential solutions.

We'll also learn a bit about what compositionality means and what makes fs2 a truly compositional library.

Links in the talk:

Compose your program flow with Stream, by Fabio Labella: https://youtu.be/x3GLwl1FxcA
Compositional Programming, by Runar Bjarnason: https://youtu.be/ElLxn_l7P2I
My blog: https://blog.kubukoz.com
Code for this talk: https://github.com/kubukoz/talks/tree/master/sky-full-of-streams
My YT channel: https://www.youtube.com/channel/UCBSRCuGz9laxVv0rAnn2O9Q

Transcript

  1. Listening to events (UI, Queue, ...)
    Scanning paginated results (external APIs)
    Traversing a list multiple times

        class Traversing(
          findProject: ProjectId => IO[Project],
          findIssues: Project => IO[List[IssueId]],
          findIssue: IssueId => IO[Issue],
          allProjectIds: IO[List[ProjectId]],
          me: UserId
        ) {
          def allMyIssuesInProjectsWithExclusions(
            excludedProject: Project => Boolean,
            excludedIssue: Issue => Boolean
          ): IO[List[Issue]] = ???
        }
  2. Listening to events (UI, Queue, ...)
    Scanning paginated results (external APIs)
    Traversing a list multiple times

        trait Projects {
          def getPage(afterProject: Option[ProjectId]): IO[List[Project]]
        }

        trait Issues {
          def getPage(
            projectId: ProjectId,
            afterIssue: Option[IssueId]
          ): IO[List[Issue]]
        }

        class Github(projects: Projects, issues: Issues) {
          def findFirstWithMatchingIssue(
            predicate: Issue => Boolean
          ): IO[Option[Project]] = ???
        }
  3. Listening to events (UI, Queue, ...)
    Scanning paginated results (external APIs)
    Traversing a list multiple times

        trait Listener {
          def create[A: Decoder](
            handler: A => Unit
          ): Unit = ???
        }
  4. Many, many others
    - TCP connections
    - HTTP calls
    - Scheduled work
    - Files, binary data
    - Websockets
  5. Common core?
    Potential effects at each step
    Potential parallelism at each step
    Potentially making control flow decisions
  6. Common core?
    Handling multiple values
    Potential effects at each step
    Potential parallelism at each step
    Potentially making control flow decisions
  7. Common core?
    Handling multiple values
    Potential effects at each step
    Potential parallelism at each step
    Potentially making control flow decisions
    Resource safety
  10. fs2.Stream[F[_], O]
    An immutable, lazy value
    Emits from 0 to ∞ values || Fails with a Throwable
    Can have effects of F
  11. fs2.Stream[F[_], O]
    An immutable, lazy value
    Emits from 0 to ∞ values || Fails with a Throwable
    Can have effects of F
    F[A] = usually IO[A]
  12. fs2.Stream[F[_], O]
    An immutable, lazy value
    Emits from 0 to ∞ values || Fails with a Throwable
    Can have effects of F
    F[A] = usually IO[A]
    Or, less formally...
  13. Pure streams ✨
    fs2.Stream[fs2.Pure, O]
    Thanks to a clever hack:

        package object fs2 {
          type Pure[A] <: Nothing
        }
  14. Pure streams ✨
    fs2.Stream[fs2.Pure, O]
    Thanks to a clever hack:

        package object fs2 {
          type Pure[A] <: Nothing
        }

    And covariance:
  15. Pure streams ✨
    fs2.Stream[fs2.Pure, O]
    Thanks to a clever hack:

        package object fs2 {
          type Pure[A] <: Nothing
        }

    And covariance:

        final class Stream[+F[_], +O]
  16. Pure streams ✨
    fs2.Stream[fs2.Pure, O]
    Thanks to a clever hack:

        package object fs2 {
          type Pure[A] <: Nothing
        }

    And covariance:

        final class Stream[+F[_], +O]

        val pure: Stream[Pure, Int] = Stream(1, 2, 3)
        val io : Stream[F, Int] = p
  17. Pure streams ✨
    fs2.Stream[fs2.Pure, O]
    Thanks to a clever hack:

        package object fs2 {
          type Pure[A] <: Nothing
        }

    And covariance:

        final class Stream[+F[_], +O]

        val pure: Stream[Pure, Int] = Stream(1, 2, 3)
        val io : Stream[F, Int] = p

    For any F[_]
  18. Building fs2 streams
    Lift a single value: Stream.emit(_: A)
    Lift a sequence: Stream.emits(_: Seq[A])
  19. Building fs2 streams
    Lift a single value: Stream.emit(_: A)
    Lift a single effect
    Lift a sequence: Stream.emits(_: Seq[A])
  20. Building fs2 streams
    Lift a single value: Stream.emit(_: A)
    Lift a single effect: Stream.eval(_: F[A])
    Lift a sequence: Stream.emits(_: Seq[A])
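As a minimal sketch of the three constructors on slides 18 to 20, the snippet below builds a few small streams and runs them. It assumes fs2 and cats-effect 3 on the classpath (the deck itself predates cats-effect 3; on cats-effect 2 the `unsafe.implicits.global` import is not needed). The object and value names are made up for illustration.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global // cats-effect 3 only
import fs2.{Pure, Stream}

object BuildingStreams {
  // Lift a single value and a sequence: no effects involved, so F is Pure.
  val single: Stream[Pure, Int] = Stream.emit(1)
  val many: Stream[Pure, Int]   = Stream.emits(List(2, 3))

  // Lift a single effect: the IO runs only when the stream is compiled and run.
  val effect: Stream[IO, Int] = Stream.eval(IO(4))

  // Pure streams widen to any effect type, so they can be appended to an IO stream.
  val all: Stream[IO, Int] = single ++ many ++ effect

  // Pure streams can be turned into a List directly.
  def pureList: List[Int] = (single ++ many).toList

  // Effectful streams are compiled to an IO first, then run.
  def runAll(): List[Int] = all.compile.toList.unsafeRunSync()

  def main(args: Array[String]): Unit = {
    println(pureList)
    println(runAll())
  }
}
```

The pure part needs no `IO` at all, which is what makes streams like this easy to test.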
  21. Transforming streams

        val numbers: Stream[IO, Int] = Stream.random[IO]

        def showOut(i: Int) = IO(println(i))

        numbers
          .map(_ % 10 + 1)
          .flatMap { until => Stream.range(0, until) }
          .evalMap(showOut)
  22. Transforming streams

        def showOut(i: Int) = IO(println(i))

        numbers
          .map(_ % 10 + 1)                              // Transform each element
          .flatMap { until => Stream.range(0, until) }  // Replace element with sub-stream and flatten
          .evalMap(showOut)                             // Transform each element effectfully
  23. Transforming streams

        def showOut(i: Int) = IO(println(i))

        numbers                                         //[-467868903, 452477122, 1143039958, ...]
          .map(_ % 10 + 1)                              //[-2, 3, 9, ...]
          .flatMap { until => Stream.range(0, until) }  //[0, 1, 2,
                                                        // 0, 1, 2, 3, 4, 5, 6, 7, 8, ...]
          .evalMap(showOut)
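The values on slide 23 can be checked without any randomness or effects. The sketch below swaps `Stream.random[IO]` for a pure stream holding the three sample numbers from the slide and drops the final `evalMap`; it assumes only fs2 on the classpath, and the object and value names are hypothetical.

```scala
import fs2.{Pure, Stream}

object DeterministicPipeline {
  // The three sample values shown on slide 23, used in place of Stream.random.
  val numbers: Stream[Pure, Int] = Stream(-467868903, 452477122, 1143039958)

  // Scala's % keeps the sign of the dividend, so the first value maps to -3 + 1 = -2.
  val mapped: Stream[Pure, Int] = numbers.map(_ % 10 + 1)

  // Each element is replaced by a sub-stream; a range with a negative upper bound is empty.
  val flattened: Stream[Pure, Int] =
    mapped.flatMap(until => Stream.range(0, until))

  val mappedList: List[Int]    = mapped.toList
  val flattenedList: List[Int] = flattened.toList

  def main(args: Array[String]): Unit = {
    println(mappedList)    // List(-2, 3, 9)
    println(flattenedList) // List(0, 1, 2, 0, 1, 2, 3, 4, 5, 6, 7, 8)
  }
}
```

The first element contributes nothing to the flattened output, which is why the result starts with the range for 3.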
  24. What actually happens

        numbers.debug("random")
          .map(_ % 10 + 1).debug("map")
          .flatMap { until => Stream.range(0, until) }.debug("flatMap")
          .evalMap(showOut)
  25. What actually happens

        numbers.debug("random")
          .map(_ % 10 + 1).debug("map")
          .flatMap { until => Stream.range(0, until) }.debug("flatMap")
          .evalMap(showOut)

        random: -467868903
        map: -2
        random: 452477122
        map: 3
        flatMap: 0
        0
        flatMap: 1
        1
        flatMap: 2
        2
        random: 1143039958
        map: 9
        flatMap: 0
        0
        flatMap: 1
        1
        flatMap: 2
        2
        flatMap: 3
        3
        flatMap: 4
        4
        flatMap: 5
        5
        flatMap: 6
        6
        flatMap: 7
        7
        flatMap: 8
        8
  27. What actually happens

        random: -467868903
        map: -2
        random: 452477122
        map: 3
        flatMap: 0
        0
        flatMap: 1
        1
        flatMap: 2
        2
        random: 1143039958
        map: 9
        flatMap: 0
        0
        flatMap: 1
        1
        flatMap: 2
        2
        flatMap: 3
        3
        flatMap: 4
        4
        flatMap: 5
        5
        flatMap: 6
        6
        flatMap: 7
        7
        flatMap: 8
        8

        numbers.debug("random")
          .map(_ % 10 + 1).debug("map")
          .flatMap { until => Stream.range(0, until) }.debug("flatMap")
          .evalMap(showOut)
  28. Control flow

        val file: Stream[IO, BufferedReader] =
          Stream.bracket {
            IO {
              new BufferedReader(new FileReader(new File("./build.sbt")))
            }
          }(f => IO(f.close()))
  29. Control flow

        val file: Stream[IO, BufferedReader] =
          Stream.bracket {
            IO {
              new BufferedReader(new FileReader(new File("./build.sbt")))
            }
          }(f => IO(f.close()))

        val process =
          Stream(1, 3, 5) ++
            Stream.random[IO].flatMap(Stream.range(0, _)).take(5) ++
            Stream.sleep_[IO](200.millis) ++
            file.flatMap { reader =>
              Stream
                .eval(IO(Option(reader.readLine())))
                .repeat
                .unNoneTerminate
            }.map(_.length)
  30. Control flow

        Stream(1, 3, 5) ++                                         // 3 known elements
          Stream.random[IO].flatMap(Stream.range(0, _)).take(5) ++ // 5 ~random elements
          Stream.sleep_[IO](200.millis) ++                         // No elements, 200ms wait
          file.flatMap { reader =>                                 // Lots of elements (1 per file line)
            Stream
              .eval(IO(Option(reader.readLine())))
              .repeat
              .unNoneTerminate
          }.map(_.length)
  31. def slowDownEveryNTicks(
        resets: Stream[IO, Unit],
        n: Int
      ): Stream[IO, FiniteDuration]

    - emit n values every 1 millisecond
    - emit n values every 2 milliseconds
    - emit n values every 4 milliseconds
    - ...

    Reset delay every time something is emitted by `resets`
  32. def slowDownEveryNTicks(
        resets: Stream[IO, Unit],
        n: Int
      ): Stream[IO, FiniteDuration] = {
        val showSlowingDown =
          Stream.eval_(IO(println("---------- Slowing down! ----------")))
        val showResetting =
          IO(println("---------- Resetting delays! ----------"))

        val delaysExponential: Stream[IO, FiniteDuration] =
          Stream
            .iterate(1.millisecond)(_ * 2)
            .flatMap {
              Stream.awakeDelay[IO](_).take(n.toLong) ++ showSlowingDown
            }

        Stream.eval(MVar.empty[IO, Unit]).flatMap { restart =>
          val delaysUntilReset =
            delaysExponential.interruptWhen(restart.take.attempt)

          delaysUntilReset.repeat concurrently resets.evalMap(_ => restart.put(()) *> showResetting)
        }
      }
  34. def slowDownEveryNTicks(
        resets: Stream[IO, Unit],
        n: Int
      ): Stream[IO, FiniteDuration] = {
        val showSlowingDown =
          Stream.eval_(IO(println("---------- Slowing down! ----------")))
        val showResetting =
          IO(println("---------- Resetting delays! ----------"))

        val delaysExponential: Stream[IO, FiniteDuration] =
          Stream
            .iterate(1.millisecond)(_ * 2)
            .flatMap {
              Stream.awakeDelay[IO](_).take(n.toLong) ++ showSlowingDown
            }

        Stream.eval(MVar.empty[IO, Unit]).flatMap { restart =>
          val delaysUntilReset =
            delaysExponential.interruptWhen(restart.take.attempt)

          delaysUntilReset.repeat concurrently ... resets.evalMap(_ => restart.put(()) *> showResetting)
        }
      }
  37. Inversion of flow control

        def allMyIssuesInProjectsWithExclusions(
          excludedProject: Project => Boolean,
          excludedIssue: Issue => Boolean
        ): IO[List[Issue]]

        def firstIssue(
          inProject: Project => Boolean,
        ): IO[Option[Issue]]
  39. Inversion of flow control

        val projects: Stream[IO, Project]
        val issues: Pipe[IO, Project, Issue]

        def allMyIssuesInProjectsWithExclusions(
          excludedProject: Project => Boolean,
          excludedIssue: Issue => Boolean
        ): IO[List[Issue]]

        def firstIssue(
          inProject: Project => Boolean,
        ): IO[Option[Issue]]
  40. Inversion of flow control

        val projects: Stream[IO, Project]
        val issues: Pipe[IO, Project, Issue]

        def allMyIssuesInProjectsWithExclusions(
          excludedProject: Project => Boolean,
          excludedIssue: Issue => Boolean
        ): IO[List[Issue]] =
          projects
            .filter(!excludedProject(_))
            .through(issues)
            .filter(!excludedIssue(_))
            .filter(_.creator === me)
            .compile
            .toList

        def firstIssue(
          inProject: Project => Boolean,
        ): IO[Option[Issue]]
  41. Inversion of flow control

        val projects: Stream[IO, Project]
        val issues: Pipe[IO, Project, Issue]

        def allMyIssuesInProjectsWithExclusions(
          excludedProject: Project => Boolean,
          excludedIssue: Issue => Boolean
        ): IO[List[Issue]] =
          projects
            .filter(!excludedProject(_))
            .through(issues)
            .filter(!excludedIssue(_))
            .filter(_.creator === me)
            .compile
            .toList

        def firstIssue(
          inProject: Project => Boolean,
        ): IO[Option[Issue]] =
          projects
            .find(inProject)
            .through(issues)
            .head.compile.last
  42. Inversion of flow control

        type Pipe[F[_], -I, +O] = Stream[F, I] => Stream[F, O]

        val projects: Stream[IO, Project]
        val issues: Pipe[IO, Project, Issue]

        def allMyIssuesInProjectsWithExclusions(
          excludedProject: Project => Boolean,
          excludedIssue: Issue => Boolean
        ): IO[List[Issue]] =
          projects
            .filter(!excludedProject(_))
            .through(issues)
            .filter(!excludedIssue(_))
            .filter(_.creator === me)
            .compile
            .toList

        def firstIssue(
          inProject: Project => Boolean,
        ): IO[Option[Issue]] =
          projects
            .find(inProject)
            .through(issues)
            .head.compile.last
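The slides take `projects: Stream[IO, Project]` as given, while slide 2 only offers a paginated `getPage`. One possible way to bridge the two is sketched below; it is not taken from the talk. It assumes fs2 and cats-effect 3 on the classpath, simplifies `ProjectId` and `Project` to bare case classes, and uses hypothetical names (`InMemoryProjects`, `allProjects`, `allIds`, `firstMatching`) with a fake in-memory backend that counts page requests.

```scala
import java.util.concurrent.atomic.AtomicInteger

import cats.effect.IO
import cats.effect.unsafe.implicits.global // cats-effect 3 only
import fs2.Stream

object PaginationSketch {
  // Simplified stand-ins for the types on slide 2.
  final case class ProjectId(value: Int)
  final case class Project(id: ProjectId)

  trait Projects {
    def getPage(afterProject: Option[ProjectId]): IO[List[Project]]
  }

  // Hypothetical in-memory backend: 7 projects, served 3 per page.
  // `calls` counts how many pages were actually requested.
  final class InMemoryProjects extends Projects {
    val calls = new AtomicInteger(0)
    private val all = (1 to 7).toList.map(i => Project(ProjectId(i)))

    def getPage(afterProject: Option[ProjectId]): IO[List[Project]] = IO {
      calls.incrementAndGet()
      val remaining = afterProject match {
        case None        => all
        case Some(after) => all.dropWhile(_.id != after).drop(1)
      }
      remaining.take(3)
    }
  }

  // Emits one page, then lazily continues after its last element.
  // An empty page ends the stream.
  def allProjects(api: Projects): Stream[IO, Project] = {
    def go(after: Option[ProjectId]): Stream[IO, Project] =
      Stream.eval(api.getPage(after)).flatMap { page =>
        Stream.emits(page) ++
          page.lastOption.fold[Stream[IO, Project]](Stream.empty)(last => go(Some(last.id)))
      }
    go(None)
  }

  // Runs the whole stream and returns every project number.
  def allIds(api: Projects): List[Int] =
    allProjects(api).map(_.id.value).compile.toList.unsafeRunSync()

  // Stops at the first match, so later pages are never requested.
  def firstMatching(api: Projects)(p: Project => Boolean): Option[Project] =
    allProjects(api).find(p).compile.last.unsafeRunSync()
}
```

With this backend, reading everything takes four page requests (three non-empty pages plus the empty one that ends the stream), while `firstMatching` for a project on the first page takes one. The consumer decides how much to fetch, which is the inversion of control the slides describe.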
  43. Reusable transformations

        val filterText: Pipe[IO, Message, TextMessage] = _.evalMap {
          case msg: TextMessage => msg.some.pure[IO]
          case msg              => logger.error(s"Not a text message: $msg").as(none)
        }.unNone

        def decode[A: Decoder]: Pipe[IO, String, A] =
          _.map(io.circe.parser.decode[A](_)).evalMap {
            case Right(v) => v.some.pure[IO]
            case Left(e)  => logger.error(e)("Decoding error").as(none)
          }.unNone

        consumerMessages
          .through(filterText)
          .map(_.getText)
          .through(decode[UserEvent])
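Because a `Pipe` is only a function from stream to stream, the same idea can be tried on pure data with no logger or JSON decoder. The sketch below is a simplified analogue of `filterText` and `decode`, not the talk's code: `parseInts` and `evens` are hypothetical pipes, it assumes fs2 on the classpath, and `toIntOption` requires Scala 2.13 or later.

```scala
import fs2.{Pipe, Pure, Stream}

object PipeSketch {
  // Hypothetical pipe: parses each line as an Int and drops lines that do not parse.
  def parseInts[F[_]]: Pipe[F, String, Int] =
    _.map(_.toIntOption).unNone

  // Hypothetical pipe: keeps only even numbers.
  def evens[F[_]]: Pipe[F, Int, Int] =
    _.filter(_ % 2 == 0)

  val input: Stream[Pure, String] = Stream("1", "2", "oops", "4")

  // Pipes are applied with `through`, one after another.
  val parsed: Stream[Pure, Int] = input.through(parseInts[Pure])
  val result: List[Int]         = parsed.through(evens[Pure]).toList

  // Since pipes are plain functions, they also compose with `andThen`.
  val combined: Pipe[Pure, String, Int] = parseInts[Pure].andThen(evens[Pure])
  val viaCombined: List[Int]            = input.through(combined).toList

  def main(args: Array[String]): Unit = {
    println(result)      // List(2, 4)
    println(viaCombined) // List(2, 4)
  }
}
```

Both pipes are polymorphic in `F`, so they work unchanged on an `IO` stream such as `consumerMessages`.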
  44. Compositionality principle
    The meaning of an expression is the meaning of its parts and the way they are combined together.
  45. Compositionality and referential transparency

        val program: IO[Unit] = {
          val prog1 = createFork.bracket(commit(data))(closeFork(_))
          val prog2 = sendEvent(commitSuccessful(data))

          prog1 >> prog2
        }

        val program: IO[Unit] =
          createFork.bracket(commit(data))(closeFork(_)) >>
            sendEvent(commitSuccessful(data))
  46. Referential transparency is not always enough

        val program: IO[Unit] = {
          val prog1 = createFork.bracket(commit(data))(closeFork(_))
          val prog2 = sendEvent(commitSuccessful(data))

          prog1 >> prog2
        }
  47. Referential transparency is not always enough

        def program[A](use: Repository => IO[A]): IO[A] = {
          val prog1 = createFork.bracket(use)(closeFork(_))
          val prog2 = sendEvent(commitSuccessful(data))

          prog1 <* prog2
        }
  49. Referential transparency is not always enough

        def program[A](use: Repository => IO[A]): IO[A] = {
          val prog1 = createFork.bracket(use)(closeFork(_))
          val prog2 = sendEvent(commitSuccessful(data))

          prog1 <* prog2
        }

        program(quiteLongStream(_).compile.toList):
  51. Streams are self-contained

        val program: Stream[IO, Repository] =
          Stream.bracket(createFork)(closeFork(_)) ++
            Stream.eval_(sendEvent(commitSuccessful(data)))

        program.flatMap(quiteLongStream)
  52. How can we test this stream?

        def mostUsefulIssues[F[_]](issueSource: Stream[F, Issue]) =
          issueSource.filter(_.upvotes > 100)

        mostUsefulIssues(
          Stream(Issue("#1", "/u/root", 90), Issue("#2", "/u/root", 110))
        ).toList === List(Issue("#2", "/u/root", 110))
  53. fs2 is truly compositional
    - No matter where the data comes from, it's always the same abstraction
  54. fs2 is truly compositional
    - No matter where the data comes from, it's always the same abstraction
    - All combinators work on all the streams of compatible types
  55. fs2 is truly compositional
    - No matter where the data comes from, it's always the same abstraction
    - All combinators work on all the streams of compatible types
    - Stream scales from pure streams to complex processes with lots of resources and concurrency
  56. fs2 is truly compositional
    - No matter where the data comes from, it's always the same abstraction
    - All combinators work on all the streams of compatible types
    - Stream scales from pure streams to complex processes with lots of resources and concurrency
    - List + IO on superpowers with far less responsibility for you
  57. val converter: Stream[IO, Unit] = Stream.resource(Blocker[IO]).flatMap { blocker =>
        def fahrenheitToCelsius(f: Double): Double =
          (f - 32.0) * (5.0/9.0)

        io.file.readAll[IO](Paths.get("testdata/fahrenheit.txt"), blocker, 4096)
          .through(text.utf8Decode)
          .through(text.lines)
          .filter(s => !s.trim.isEmpty && !s.startsWith("//"))
          .map(line => fahrenheitToCelsius(line.toDouble).toString)
          .intersperse("\n")
          .through(text.utf8Encode)
          .through(io.file.writeAll(Paths.get("testdata/celsius.txt"), blocker))
      }
  58. def server(
        blocker: Blocker
      ): Stream[IO, Resource[IO, fs2.io.tcp.Socket[IO]]] =
        Stream.resource(fs2.io.tcp.SocketGroup[IO](blocker)).flatMap { group =>
          group.server[IO](
            new InetSocketAddress("0.0.0.0", 8080)
          )
        }

      val clientMessages = Stream
        .resource(Blocker[IO])
        .flatMap(server)
        .map {
          Stream
            .resource(_)
            .flatMap(_.reads(1024))
            .through(fs2.text.utf8Decode)
            .through(fs2.text.lines)
            .map("Message: " + _)
        }
        .parJoin(maxOpen = 10)
  59. Acknowledgements
    Thanks to Fabio Labella for help with this talk!
    Thanks to Michael Pilquist, Paul Chiusano, Pavel Chlupacek, Fabio and all the contributors of fs2 and cats-effect/cats
    Massively inspired by:
    - Declarative control flow with fs2 streams by Fabio Labella: https://www.youtube.com/watch?v=x3GLwl1FxcA
    - Compositional Programming by Runar Bjarnason: https://www.youtube.com/watch?v=ElLxn_l7P2I