ScalaMatsuri 2016: scalaz-stream -- a worked example

Mathias Sulser

January 30, 2016

Transcript

  1. scalaz-stream
    a worked example
    — Mathias Sulser

    @suls

  2. Who is this guy?
    • Mathias Sulser / スルサー・マティアス (@suls, github.com/suls)
    • Husband of one, father of two
    • Living and working in beautiful Sendai, Japan
    • Speaking: Swiss German, English, a bit of Japanese & French

  3. In 30 minutes from now
    • You will know
      • why you would want to use scalaz-stream
      • the basic building blocks of scalaz-stream
    • You will have iteratively built a small report generator
    Through a report-generation example program, this talk explains when to use scalaz-stream and what its basic building blocks are.

  4. If your program ..
    • consumes data (file, db, network, user input, ..)
    • transforms data
    • produces data (..)
    • runs once, repeatedly or infinitely
    .. then scalaz-stream might be worth looking at.
    If you are building a program that consumes, transforms, and produces data, scalaz-stream is worth considering.

  5. The Basics

  6. Process
    Sink
    Task

  7. > Process("a","b","c")
    res0 : Process[Nothing, String]
    > res0 to io.stdOutLines
    res1 : Process[Task, Unit]
    > res1.run
    res2: Task[Unit]
    > res2.run
    a
    b
    c

  8. What just happened?
    • We set up a sequence of computations using Process
    • Then we “run” it to get its effect
    • And finally we “run” the effect
    Set up a sequence of computations with Process and run it, then finally run the effect this produces.
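
The "describe first, run later" idea can be modelled in plain Scala without the library. This is only a sketch: `Effect`, `writeAll`, and `demo` are hypothetical names standing in for `Task` and a `Process` piped to a `Sink`, and none of this is scalaz-stream API.

```scala
// Plain-Scala model of deferred effects; not the scalaz Task type.
object DeferredEffectSketch {
  // An effect is just a suspended computation.
  final case class Effect[A](run: () => A)

  // Builds a description that, when run, appends every item to `out`.
  def writeAll(items: List[String], out: StringBuilder): Effect[Unit] =
    Effect(() => items.foreach(s => out.append(s).append('\n')))

  // Returns the buffer contents before and after running the effect.
  def demo(): (String, String) = {
    val out = new StringBuilder
    val effect = writeAll(List("a", "b", "c"), out) // nothing is written yet
    val before = out.toString
    effect.run()                                    // the side effect happens here
    (before, out.toString)
  }
}
```

Building `effect` leaves the buffer empty; only the explicit `run()` call writes the three lines, which mirrors the final `res2.run` in the REPL session.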

  9. > Process(1,2) ++ Process(3,4)
    res0 : Process[Nothing, Int]
    > Process.emitAll("sendai")
    res1 : Process[Nothing, Char]
    > res1.flatMap { in : Char =>
        Process.emitAll('a' to in)
      }
    res2: Process[Nothing, Char]
    > res2.map(_.shows)
    res3: Process[Nothing, String]

  10. More Basics
    • scalaz-stream is pull based
    • Process won’t produce a value until a Sink requests one
    • Back-pressure for free
    • Being lazy allows us to create infinite Processes
    • Process can be executed multiple times
    Because scalaz-stream is pull-based, you can create infinite Processes and run a Process as many times as you like.
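
The pull-based behaviour can be illustrated with the standard library's `LazyList` (Scala 2.13+), which is used here only as an analogy for `Process`. The object and method names are hypothetical; the sketch shows that an infinite source only computes what the consumer asks for.

```scala
// Standard-library analogy for a pull-based, lazy source; not scalaz-stream.
object PullBasedSketch {
  // Returns the first `n` naturals and how many times the generator
  // function was actually invoked upstream.
  def firstNaturals(n: Int): (List[Int], Int) = {
    var computed = 0
    // Infinite source: each further element is produced only on demand.
    val naturals: LazyList[Int] =
      LazyList.iterate(0) { i => computed += 1; i + 1 }
    val taken = naturals.take(n).toList // the consumer pulls n elements
    (taken, computed)
  }
}
```

Although `naturals` has no end, `take(n).toList` terminates, and the generator runs at most `n` times.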

  11. Process
    Sink
    Task

  12. Iteration #1
    generate CSV from API

  13. Our Domain: trade reports
    case class TradeReport(
      datetime: DateTime,
      contract: String,
      lots: Int,
      price: BigDecimal,
      side: Side
    )
    sealed trait Side
    case object Buy extends Side
    case object Sell extends Side

  14. val tradeReports =
      pagedRequest(fetchTradeReports(DateTime.now))
        .map(_.productIterator.mkString(","))
        .prepend(Seq("datetime,contract,lots,price,direction"))
        .intersperse("\n")
        .pipe(text.utf8Encode)
        .to(io.fileChunkW("trade_reports.csv"))
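
The row-rendering part of this pipeline (`productIterator.mkString`, a header line, newline separators) can be tried in plain Scala. This is a sketch under assumptions: `datetime` is a `String` only to avoid a date library, `toCsv` is a hypothetical helper, and no field escaping is done, so values containing commas would break the output.

```scala
// Plain-Scala sketch of the CSV rendering step; no streaming, no file IO.
object CsvSketch {
  sealed trait Side
  case object Buy extends Side
  case object Sell extends Side

  // datetime is a String here only to keep the sketch dependency-free.
  final case class TradeReport(
    datetime: String,
    contract: String,
    lots: Int,
    price: BigDecimal,
    side: Side
  )

  val header = "datetime,contract,lots,price,direction"

  // One comma-joined line per report, header first, newline between lines.
  def toCsv(reports: Seq[TradeReport]): String =
    (header +: reports.map(_.productIterator.mkString(","))).mkString("\n")
}
```

`productIterator` walks the case class fields in declaration order, which is why the header has to list the columns in the same order.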

  15. Pagination
    type Offset = Option[Int]

    // A page of results plus the offset of the next page, if there is one
    case class Page[A](results: Seq[A], next: Offset)

    def fetchTradeReports(dateTime: DateTime)
                         (offset: Offset): Task[Page[TradeReport]]

    def pagedRequest(
      f: Offset => Task[Page[TradeReport]],
      current: Offset
    ): Process[Task, TradeReport]

  16. Pagination
    def pagedRequest(
      f: Offset => Task[Page[TradeReport]],
      current: Offset = Some(0)
    ): Process[Task, TradeReport] =
      Process
        .eval(f(current)) // fetch the current page
        .flatMap { response: Page[TradeReport] =>
          // emit this page's results, then recurse while there is a next offset
          Process.emitAll(response.results) ++
            response.next.map { o =>
              pagedRequest(f, Option(o))
            }.getOrElse(Process.empty[Task, TradeReport])
        }

  17. val tradeReports =
      pagedRequest(fetchTradeReports(DateTime.now))
        .map(_.productIterator.mkString(","))
        .prepend(Seq("datetime,contract,lots,price,direction"))
        .intersperse("\n")
        .pipe(text.utf8Encode)
        .to(io.fileChunkW("trade_reports.csv"))

  18. Iteration #2
    refactor for purity

  19. Pure Pagination
    def pagedRequest(
      f: Offset => Task[Page[TradR]],
      current: Offset
    ): Process[Task, TradeReport]

    def pagedRequest[F[_], A](
      f: Offset => F[Page[A]],
      current: Offset
    ): Process[F, A]

  20. Pagination
    "pagination is flattening" >>
      prop { (is: List[List[Int]]) =>
        val f: (Offset) => Task[Page[Int]] = // ..
        is.flatten must_== pagedRequest(f, Some(0)).runLog.run
      }.setGen(Gen.nonEmptyListOf(Gen.listOf(Gen.posNum[Int])))

    [info] Finished in 588 ms
    [info] 1 example, 100 expectations, 0 failure, 0 error
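
The "pagination is flattening" property can be reproduced without effects in plain Scala (2.13+). This is a sketch, not the talk's code: `pageSource` is a hypothetical way to build a page function from a list of lists (the slide elides how `f` is built), and `LazyList` replaces `Process`, so no `Task` is involved.

```scala
// Effect-free model of paged fetching; LazyList stands in for Process.
object PaginationSketch {
  type Offset = Option[Int]
  final case class Page[A](results: Seq[A], next: Offset)

  // Hypothetical page source: offset i returns pages(i), and `next`
  // points at i + 1 until the last page has been served.
  def pageSource[A](pages: List[List[A]]): Offset => Page[A] = {
    case Some(i) if i >= 0 && i < pages.length =>
      Page(pages(i), if (i + 1 < pages.length) Some(i + 1) else None)
    case _ =>
      Page(Nil, None)
  }

  // Same recursion shape as pagedRequest: emit the current page, then
  // continue with the next offset if there is one.
  def pagedRequest[A](f: Offset => Page[A], current: Offset = Some(0)): LazyList[A] = {
    val page = f(current)
    // `rest` is a def and #::: takes it by name, so later pages are
    // only requested when the consumer reaches them.
    def rest: LazyList[A] =
      page.next.map(o => pagedRequest(f, Some(o))).getOrElse(LazyList.empty)
    LazyList.from(page.results) #::: rest
  }
}
```

For any list of pages, draining the result yields the same elements as `pages.flatten`, which is the statement the property checks.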

  21. Split Pure vs. IO
    val tradeReports =
      pagedRequest(fetchT..
      .map(_.productIte..
      .prepend(Seq("dat..
      .intersperse("\n"..
      .pipe(text.utf8En..
      .to(io.fileChunkW..

    def tradeReports[F[_]](
      f: Offset => F[Page[TradeReport]]
    ): Process[F, String]

    def writingTo(fileName: String)
                 (data: Process[Task, String])

    val program =
      writingTo("trade_reports.csv") {
        tradeReports(
          fetchTradeReports(DateTime.now))
      }

  22. (no text on this slide)

  23. Iteration #3
    generate EOD (end-of-day) report

  24. What I haven’t told you
    • Process is a deterministic sequence of actions
    • scalaz-stream provides primitives for non-deterministic operations
    • merge operator to combine n Processes
    • async.boundedQueue to fan out
    Process is a deterministic sequence of actions, but scalaz-stream also provides non-deterministic operations.

  25. Asynchronous bounded queue for fan-out
    > val q = async.boundedQueue[String](1)
    q: sz.s.async.mutable.Queue[String]
    > val p = Process("a", "b", "c")
        .to(q.enqueue)
        .onComplete(Process eval q.close)
    p : Process[Task, Unit]
    > val r = q.dequeue to io.stdOutLines
    r: Process[Task, Unit]

  26. Gathering non-deterministic computations
    > Nondeterminism[Task]
        .gatherUnordered(
          p.run :: r.run :: Nil)
    res0: Task[List[Unit]]
    > res0.run
    a
    b
    c
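
The producer and consumer running concurrently around a queue of capacity 1 can be modelled with the JDK's `ArrayBlockingQueue`. This is an analogy only: it uses a plain thread instead of `Task`, and a `None` end marker plays the role of `q.close`. `BoundedQueueSketch` and `fanThrough` are hypothetical names.

```scala
import java.util.concurrent.ArrayBlockingQueue

// JDK-based model of a bounded queue between a producer and a consumer.
object BoundedQueueSketch {
  def fanThrough(items: List[String]): List[String] = {
    // Capacity 1: put blocks until the consumer has taken the previous
    // element, which is the back-pressure the bounded queue provides.
    val queue = new ArrayBlockingQueue[Option[String]](1)

    val producer = new Thread(() => {
      items.foreach(s => queue.put(Some(s)))
      queue.put(None) // end marker, standing in for closing the queue
    })
    producer.start()

    // Consume on the calling thread until the end marker arrives.
    val received = Iterator
      .continually(queue.take())
      .takeWhile(_.isDefined)
      .flatMap(_.toList)
      .toList

    producer.join()
    received
  }
}
```

With a single producer and a single consumer the elements arrive in the order they were enqueued, matching the `a`, `b`, `c` output on the slide.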

  27. (no text on this slide)

  28. Fetching trade reports only once
    val fetcher: Process[Task, Unit] =
      tradeReports(fetchTradeReports(DateTime.now))
        .observe(q2.enqueue)
        .to(q1.enqueue)
        .onComplete(Process eval q1.close)
        .onComplete(Process eval q2.close)

  29. Consume 1/2: consume the fetched trade reports
    val program =
      writingTo("trade_reports.csv") {
        tradeReports(
          fetchTradeReports(DateTime.now))
      }

    val tradeReportsCsv =
      writingTo("trade_reports.csv") {
        q1.dequeue
          .map(..
      }

  30. Consume 2/2: generate the daily summary
    case class EndOfDayPosition(
      contract: Contract,
      position: Int
    )

    val summarize =
      process1.fold(
        // etc.
      )

    val endOfDaySummary = writingTo("summary.csv") {
      q2.dequeue
        .pipe(summarize)
        .map(_.productIterator.mkString(","))
        .prepend(Seq("contract,position"))
        .intersperse("\n")
    }

  31. Compute EoD positions
    val summarize =
      process1.fold(
        Map
          .empty[Contract, EndOfDayPosition]
          .withDefault(EndOfDayPosition(_, 0))
      ) { (s, tr: TradeReport) =>
        val eod = s(tr.contract)
        s + (tr.contract -> eod.copy(position = tr.side match {
          case Buy  => eod.position + tr.lots
          case Sell => eod.position - tr.lots
        }))
      }.flatMap(m => Process.emitAll(m.values.toSeq))
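
The fold above can be checked against a group-and-sum reference in plain Scala. This is a sketch under assumptions: `Contract` is taken to be an alias for `String` (the slides do not show its definition), `Trade` is a trimmed-down stand-in for `TradeReport`, and `foldLeft` replaces `process1.fold`.

```scala
// Plain-Scala model of the end-of-day position fold; no streaming involved.
object SummarizeSketch {
  sealed trait Side
  case object Buy extends Side
  case object Sell extends Side

  type Contract = String // assumption: not defined on the slides
  final case class Trade(contract: Contract, lots: Int, side: Side)
  final case class EndOfDayPosition(contract: Contract, position: Int)

  // Buys add lots to the position, sells subtract them.
  private def signed(t: Trade): Int = t.side match {
    case Buy  => t.lots
    case Sell => -t.lots
  }

  // Fold version, mirroring the shape of `summarize` on the slide.
  def summarize(trades: List[Trade]): Set[EndOfDayPosition] =
    trades
      .foldLeft(Map.empty[Contract, Int].withDefaultValue(0)) { (acc, t) =>
        acc.updated(t.contract, acc(t.contract) + signed(t))
      }
      .map { case (c, p) => EndOfDayPosition(c, p) }
      .toSet

  // Reference version: group by contract, then sum the signed lots.
  def groupAndSum(trades: List[Trade]): Set[EndOfDayPosition] =
    trades
      .groupBy(_.contract)
      .map { case (c, ts) => EndOfDayPosition(c, ts.map(signed).sum) }
      .toSet
}
```

Both functions should agree on any input, which is the kind of equivalence a "summarizing is groupby and sum" property would check.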

  32. Not testing, proving it ..
    import shapeless.contrib.scalacheck._

    "summarizing is groupby and sum" >> {
      prop { (trs: List[TradeReport]) => (trs.size > 0) ==> {
        trs
          ... must
            containTheSameElementsAs(
              Process
                .emitAll(trs)
                .pipe(demo.summarize)
                .toList)
      }}
    }

    [info] Finished in 671 ms
    [info] 1 example, 100 expectations, 0 failure, 0 error

  33. The Finale
    val all = Nondeterminism[Task].gatherUnordered(
      fetcher.run ::
      tradeReportsCsv.run ::
      endOfDaySummary.run :: Nil
    )

  34. (no text on this slide)

  35. In 30 minutes from now
    • You will know
      • why you would want to use scalaz-stream
      • the basic building blocks of scalaz-stream
    • You will have iteratively built a small report generator
    By now you should want to use scalaz-stream too.

  36. One more thing
    • Before: scalaz-stream
    • Soon: Functional Streams for Scala / fs2
    • library with 0 dependencies
    • Process[F, O] becomes Stream[F, W]
    scalaz-stream has been renamed to fs2, and it has zero library dependencies.

  37. Q&A / Thank you!

    @suls
    github.com/suls
