ScalaMatsuri 2016: scalaz-stream -- a worked example

Mathias Sulser

January 30, 2016
Transcript

  1. scalaz-stream: a worked example
    Mathias Sulser, @suls

  2. Who is this guy?

    • Mathias Sulser / スルサー・マティアス
    • @suls, github.com/suls
    • Husband of one, father of two
    • Living and working in beautiful Sendai, Japan
    • Speaking: Swiss German, English, a bit of Japanese & French
  3. In 30 minutes from now

    You will know:
    • why you would want to use scalaz-stream
    • the basic building blocks of scalaz-stream
    • how to iteratively build a small report generator
  4. If your program ..

    • consumes data (file, db, network, user input, ..)
    • transforms data
    • produces data (..)
    • runs once, repeatedly or infinitely

    .. then scalaz-stream might be worth looking at.
  5. The Basics

  6. Process Sink Task

  7.

    > Process("a","b","c")
    res0: Process[Nothing, String]
    > res0 to io.stdOutLines
    res1: Process[Task, Unit]
    > res1.run
    res2: Task[Unit]
    > res2.run
    a
    b
    c
  8. What just happened?

    • We set up a sequence of computations using Process
    • Then we “run” the Process to get its effect as a Task
    • And finally we “run” the Task to perform the effect
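The two-stage "run" can be illustrated without the library. The following is a minimal sketch in plain Scala; `Effect` is a hypothetical stand-in for a Task-like type and is not scalaz's API. It only aims to show that building the description performs no side effect, and that the effect happens each time the description is explicitly run.

```scala
object DescribeThenRun {
  // Hypothetical stand-in for a Task: a value that describes an effect.
  final class Effect[A](thunk: () => A) {
    def run: A = thunk()
  }

  // Records what has been "printed" so the behaviour can be inspected.
  val printed = scala.collection.mutable.ListBuffer.empty[String]

  // Describes printing every line; nothing is executed here.
  def printAll(lines: List[String]): Effect[Unit] =
    new Effect(() => lines.foreach(l => printed += l))

  // Building the description leaves `printed` untouched.
  val description: Effect[Unit] = printAll(List("a", "b", "c"))
}
```

Calling `DescribeThenRun.description.run` appends the three lines to `printed`; calling it again appends them a second time, because the description is a reusable value.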
  9.

    > Process(1,2) ++ Process(3,4)
    res0: Process[Nothing, Int]
    > Process.emitAll("sendai")
    res1: Process[Nothing, Char]
    > res1.flatMap { in: Char =>
        Process.emitAll('a' to in)
      }
    res2: Process[Nothing, Char]
    > res2.map(_.shows)
    res3: Process[Nothing, String]
  10. More Basics

    • scalaz-stream is pull based
    • Process won’t produce a value until a Sink requests one
    • Back-pressure for free
    • Being lazy allows us to create infinite Processes
    • Process can be executed multiple times
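Pull-based evaluation can be sketched with the standard library's `Iterator`, used here as an analogy rather than as scalaz-stream itself. The names `naturals`, `produced`, and `firstThreeSquares` are made up for this illustration. The source is infinite, yet only the elements the consumer asks for are ever computed.

```scala
object PullBased {
  // Counts how many elements the source actually produced.
  var produced = 0

  // An infinite source: each element is computed only when it is pulled.
  def naturals: Iterator[Int] =
    Iterator.from(1).map { n => produced += 1; n }

  // The consumer decides how much to pull; here only three elements.
  def firstThreeSquares: List[Int] =
    naturals.map(n => n * n).take(3).toList
}
```

Because `naturals` is a `def`, every call builds a fresh pipeline, which loosely mirrors the point that a Process can be executed multiple times.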
  11. Process Sink Task

  12. Iteration #1: generate CSV from API
  13. Our Domain (trade reports)

    case class TradeReport(
      datetime: DateTime,
      contract: String,
      lots: Int,
      price: BigDecimal,
      side: Side
    )

    sealed trait Side
    case object Buy extends Side
    case object Sell extends Side
  14.

    val tradeReports =
      pagedRequest(fetchTradeReports(DateTime.now))
        .map(_.productIterator.mkString(","))
        .prepend(Seq("datetime,contract,lots,price,direction"))
        .intersperse("\n")
        .pipe(text.utf8Encode)
        .to(io.fileChunkW("trade_reports.csv"))
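The row-rendering steps of this pipeline (`productIterator.mkString`, prepending a header, interspersing newlines) can be tried on a plain `List`. This is a sketch under assumptions: `datetime` is a `String` instead of the slides' `DateTime`, and `intersperse` and `toCsv` are helpers written for this example, not library functions.

```scala
object CsvSketch {
  sealed trait Side
  case object Buy extends Side
  case object Sell extends Side

  // Simplified domain: datetime is a plain String here (an assumption).
  case class TradeReport(
    datetime: String,
    contract: String,
    lots: Int,
    price: BigDecimal,
    side: Side
  )

  // Places `sep` between consecutive elements.
  def intersperse[A](xs: List[A], sep: A): List[A] =
    xs match {
      case Nil          => Nil
      case head :: tail => head :: tail.flatMap(x => List(sep, x))
    }

  // Header first, then one comma-separated row per report, newline-separated.
  def toCsv(reports: List[TradeReport]): String = {
    val header = "datetime,contract,lots,price,direction"
    val rows   = reports.map(_.productIterator.mkString(","))
    intersperse(header :: rows, "\n").mkString
  }
}
```

Note that `productIterator.mkString(",")` does no quoting or escaping, so a field containing a comma would break the row; that is acceptable for a demo but worth remembering.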

  15. Pagination

    def fetchTradeReports(dateTime: DateTime)(offset: Offset): Task[Page[TradeReport]]

    type Offset = Option[Int]

    // Page takes a type parameter so that Page[TradeReport] above type-checks
    case class Page[A](results: Seq[A], next: Offset)

    def pagedRequest(
      f: Offset => Task[Page[TradeReport]],
      current: Offset
    ): Process[Task, TradeReport]
  16. Pagination

    def pagedRequest(
      f: Offset => Task[Page[TradeReport]],
      current: Offset = Some(0)
    ): Process[Task, TradeReport] =
      Process
        .eval(f(current))                        // fetch the current page
        .flatMap { response: Page[TradeReport] =>
          Process.emitAll(response.results) ++   // emit its results ..
            response.next.map { o =>             // .. then recurse if there is a next page
              pagedRequest(f, Option(o))
            }.getOrElse(Process.empty[Task, TradeReport])
        }
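The recursion in the pagination slide can be mirrored with the standard library alone. The sketch below swaps `Process[Task, A]` for a lazy `Iterator[A]` and calls the page function directly instead of inside a Task; `fakeApi` is a hypothetical in-memory page source invented for this example and expects at least one page.

```scala
object PaginationSketch {
  type Offset = Option[Int]
  case class Page[A](results: Seq[A], next: Offset)

  // Fetch the current page, emit its results, then continue with the
  // next offset if there is one. Iterator's ++ takes its argument by name,
  // so later pages are only fetched when the consumer reaches them.
  def pagedRequest[A](f: Offset => Page[A], current: Offset = Some(0)): Iterator[A] = {
    val page = f(current)
    page.results.iterator ++ (page.next match {
      case Some(o) => pagedRequest(f, Some(o))
      case None    => Iterator.empty
    })
  }

  // Hypothetical in-memory "API": page i holds pages(i) and points at i + 1.
  def fakeApi[A](pages: Vector[Seq[A]]): Offset => Page[A] = { offset =>
    val i = offset.getOrElse(0)
    Page(pages(i), if (i + 1 < pages.size) Some(i + 1) else None)
  }
}
```

With pages `Vector(Seq(1, 2), Seq(), Seq(3))`, draining the iterator yields the flattened elements in order, including across the empty middle page.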
  17. val tradeReports = 
 pagedRequest(fetchTradeReports(DateTime.now))
 .map(_.productIterator.mkString(","))
 .prepend(Seq("datetime,contract,lots,price,direction"))
 .intersperse("\n")
 .pipe(text.utf8Encode)
 .to(io.fileChunkW("trade_reports.csv"))

  18. Iteration #2: refactor for purity

  19. Pure Pagination

    Before:

    def pagedRequest(
      f: Offset => Task[Page[TradR]],
      current: Offset
    ): Process[Task, TradeReport]

    After:

    def pagedRequest[F[_], A](
      f: Offset => F[Page[A]],
      current: Offset
    ): Process[F, A]
  20. Pagination

    "pagination is flattening" >>
      prop { (is: List[List[Int]]) =>
        val f: (Offset) => Task[Page[Int]] = // ..
        is.flatten must_== pagedRequest(f, Some(0)).runLog.run
      }.setGen(Gen.nonEmptyListOf(Gen.listOf(Gen.posNum[Int])))

    [info] Finished in 588 ms
    [info] 1 example, 100 expectations, 0 failure, 0 error
  21. Split Pure vs. IO

    Before:

    val tradeReports =
      pagedRequest(fetchT..
        .map(_.productIte..
        .prepend(Seq("dat..
        .intersperse("\n"..
        .pipe(text.utf8En..
        .to(io.fileChunkW..

    After:

    def tradeReports[F[_]](
      f: Offset => F[Page[TradeReport]]
    ): Process[F, String]

    def writingTo(fileName: String)(data: Process[Task, String])

    val program =
      writingTo("trade_reports.csv") {
        tradeReports(fetchTradeReports(DateTime.now))
      }
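The same split can be shown in miniature with the standard library: one function that only computes lines, and one that only writes them. This is an illustrative sketch, not the talk's code; `csvLines` is a made-up name, and this `writingTo` takes a plain `List[String]` and writes through `java.nio` rather than accepting a Process.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object PureVsIo {
  // Pure part: turns rows into CSV lines; no files, no network.
  def csvLines(header: String, rows: List[List[String]]): List[String] =
    header :: rows.map(_.mkString(","))

  // Effectful part: the only place that touches the file system.
  def writingTo(fileName: String)(lines: List[String]): Unit = {
    Files.write(
      Paths.get(fileName),
      lines.mkString("\n").getBytes(StandardCharsets.UTF_8)
    )
    ()
  }
}
```

The pure half can be tested with plain equality checks, while the effectful half stays small enough to review by eye.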
  23. Iteration #3: generate EOD (end-of-day) report

  24. What I haven’t told you

    • Process is a deterministic sequence of actions
    • scalaz-stream provides primitives for non-deterministic operations
    • merge operator to combine n Processes
    • async.boundedQueue to fan out
  25. Async bounded queue for fan-out

    > val q = async.boundedQueue[String](1)
    q: sz.s.async.mutable.Queue[String]
    > val p = Process("a", "b", "c")
        .to(q.enqueue)
        .onComplete(Process eval q.close)
    p: Process[Task, Unit]
    > val r = q.dequeue to io.stdOutLines
    r: Process[Task, Unit]
  26. Gathering non-deterministic computations

    > Nondeterminism[Task].gatherUnordered(p.run :: r.run :: Nil)
    res0: Task[List[Unit]]
    > res0.run
    a
    b
    c
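The producer/consumer hand-off through a queue of size 1 can be approximated with JVM primitives. The sketch below substitutes `java.util.concurrent.ArrayBlockingQueue` and a plain thread for scalaz-stream's async queue and `gatherUnordered`, so it shows the idea rather than the library's behaviour; `transfer` and the use of `None` as a "closed" marker are assumptions of this example.

```scala
import java.util.concurrent.ArrayBlockingQueue

object BoundedQueueSketch {
  // Capacity 1: the producer blocks on put until the consumer has taken
  // the previous element, which is a simple form of back-pressure.
  def transfer(items: List[String]): List[String] = {
    val q = new ArrayBlockingQueue[Option[String]](1)

    // Producer: enqueue every item, then signal completion with None.
    val producer = new Thread(new Runnable {
      def run(): Unit = {
        items.foreach(i => q.put(Some(i)))
        q.put(None)
      }
    })
    producer.start()

    // Consumer: dequeue until the completion marker arrives.
    val received = Iterator
      .continually(q.take())
      .takeWhile(_.isDefined)
      .collect { case Some(s) => s }
      .toList

    producer.join()
    received
  }
}
```

The two sides run concurrently, but because everything flows through one FIFO queue the consumer still sees the items in the order they were enqueued.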
  28. Fetching only once

    val fetcher: Process[Task, Unit] =
      tradeReports(fetchTradeReports(DateTime.now))
        .observe(q2.enqueue)
        .to(q1.enqueue)
        .onComplete(Process eval q1.close)
        .onComplete(Process eval q2.close)
  29. Consume 1/2

    Before:

    val program =
      writingTo("trade_reports.csv") {
        tradeReports(fetchTradeReports(DateTime.now))
      }

    After:

    val tradeReportsCsv =
      writingTo("trade_reports.csv") {
        q1.dequeue
          .map(..
      }
  30. Consume 2/2

    case class EndOfDayPosition(
      contract: Contract,
      position: Int
    )

    val endOfDaySummary =
      writingTo("summary.csv") {
        q2.dequeue
          .pipe(summarize)
          .map(_.productIterator.mkString(","))
          .prepend(Seq("contract,position"))
          .intersperse("\n")
      }

    val summarize =
      process1.fold( // etc. )
  31. Compute EoD positions

    val summarize =
      process1.fold(
        // start with no positions; unknown contracts default to position 0
        Map
          .empty[Contract, EndOfDayPosition]
          .withDefault(EndOfDayPosition(_, 0))
      ) { (s, tr: TradeReport) =>
        val eod = s(tr.contract)
        // Buy adds lots to the position, Sell subtracts them
        s + (tr.contract -> eod.copy(position = tr.side match {
          case Buy  => eod.position + tr.lots
          case Sell => eod.position - tr.lots
        }))
      }.flatMap(m => Process.emitAll(m.values.toSeq))
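The fold can be followed on concrete data using `foldLeft` on a `List` in place of `process1.fold`. This is a sketch with a trimmed-down `TradeReport` (only the fields the summary needs, with `contract` as a `String`), and it uses `getOrElse` where the slide uses `withDefault`; both are choices made for this example.

```scala
object SummarizeSketch {
  sealed trait Side
  case object Buy extends Side
  case object Sell extends Side

  // Trimmed-down types: only what the summary needs (an assumption).
  case class TradeReport(contract: String, lots: Int, side: Side)
  case class EndOfDayPosition(contract: String, position: Int)

  // Folds the trades into one running position per contract:
  // Buy adds lots, Sell subtracts them.
  def summarize(trades: List[TradeReport]): Map[String, EndOfDayPosition] =
    trades.foldLeft(Map.empty[String, EndOfDayPosition]) { (s, tr) =>
      val eod = s.getOrElse(tr.contract, EndOfDayPosition(tr.contract, 0))
      val delta = tr.side match {
        case Buy  => tr.lots
        case Sell => -tr.lots
      }
      s + (tr.contract -> eod.copy(position = eod.position + delta))
    }
}
```

For example, buying 5 lots of contract "A", selling 1 lot of "B", and then selling 2 lots of "A" leaves a position of 3 in "A" and -1 in "B".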
  32. Testing... no, proving it

    import shapeless.contrib.scalacheck._

    "summarizing is groupby and sum" >> {
      prop { (trs: List[TradeReport]) =>
        (trs.size > 0) ==> {
          trs ... must containTheSameElementsAs(
            Process
              .emitAll(trs)
              .pipe(demo.summarize)
              .toList)
        }
      }
    }

    [info] Finished in 671 ms
    [info] 1 example, 100 expectations, 0 failure, 0 error
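The idea behind "summarizing is groupby and sum" can be spot-checked without a property-testing library. The sketch below does not reconstruct the elided part of the slide; it is an independent illustration in which a trade is reduced to a `(contract, signed lots)` pair, and a seeded `scala.util.Random` loop stands in for generated inputs. All names are made up for this example.

```scala
import scala.util.Random

object GroupBySumCheck {
  // A trade reduced to (contract, signed lots): positive = Buy, negative = Sell.
  type Trade = (String, Int)

  // Implementation under test: a single left fold over the trades.
  def viaFold(trades: List[Trade]): Map[String, Int] =
    trades.foldLeft(Map.empty[String, Int]) { case (acc, (contract, lots)) =>
      acc + (contract -> (acc.getOrElse(contract, 0) + lots))
    }

  // Reference model: group by contract, then sum each group.
  def viaGroupBy(trades: List[Trade]): Map[String, Int] =
    trades.groupBy(_._1).map { case (contract, ts) => contract -> ts.map(_._2).sum }

  // Hand-rolled stand-in for a property test: 100 random non-empty inputs.
  def holds(seed: Long): Boolean = {
    val rnd = new Random(seed)
    (1 to 100).forall { _ =>
      val trades = List.fill(rnd.nextInt(20) + 1)((s"C${rnd.nextInt(3)}", rnd.nextInt(21) - 10))
      viaFold(trades) == viaGroupBy(trades)
    }
  }
}
```

A real property-based test adds shrinking and better generators; the loop here only shows what is being compared.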
  33. The Finale

    val all = Nondeterminism[Task].gatherUnordered(
      fetcher.run ::
      tradeReportsCsv.run ::
      endOfDaySummary.run ::
      Nil
    )
  35. In 30 minutes from now

    You will know:
    • why you would want to use scalaz-stream
    • the basic building blocks of scalaz-stream
    • how to iteratively build a small report generator

    By now you should want to use scalaz-stream yourself.
  36. One more thing

    • Before: scalaz-stream
    • Soon: Functional Streams for Scala / fs2
    • library with 0 dependencies
    • Process[F, O] becomes Stream[F, W]
  37. Q&A / Thank you! 
 @suls github.com/suls