Mathias Sulser
January 30, 2016

# ScalaMatsuri 2016: scalaz-stream -- a worked example


## Transcript

1. scalaz-stream
a worked example
— Mathias Sulser
@suls

2. Who is this guy?
• Mathias Sulser
@suls github.com/suls
• Husband of one, father of two
• Living and working in beautiful Sendai, Japan
• Speaking: Swiss German, English,
a bit of Japanese & French

3. In 30 minutes from now
• You will know
• why you would want to use scalaz-stream
• the basic building blocks of scalaz-stream
• how to iteratively build a small report-generation program

4. If your program
• consumes data (file, db, network, user input, ..)
• transforms data
• produces data (..)
• runs once, repeatedly or infinitely
.. then scalaz-stream might be worth looking at.

5. The Basics

6. Process
Sink

7. > Process("a","b","c")
res0 : Process[Nothing, String]
> res0 to io.stdOutLines
res1 : Process[Task, Unit]
> res1.run
res2 : Task[Unit]
> res2.run
a
b
c

8. What just happened?
• We set up a sequence of computations using
Process
• Then we “run” it to get its effect
• And ﬁnally we “run” the effect
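The two-stage "run" can be sketched in plain Scala. The `Task` below is a hypothetical stand-in for scalaz.concurrent.Task, only to show that building a description performs no effect until it is explicitly run:

```scala
// A minimal sketch of the description-vs-execution split, assuming no
// scalaz-stream on the classpath. `Task` here is a stand-in, not the
// real scalaz.concurrent.Task.
object TwoStageRun {
  final case class Task[A](unsafeRun: () => A)

  var executed = 0

  // Stage 1: build a description. Nothing runs yet, so `executed` stays 0.
  val effect: Task[Int] = Task(() => { executed += 1; executed })
}
```

Constructing `effect` prints and counts nothing; each `unsafeRun()` call performs the side effect again, mirroring how a Process can be run more than once.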

9. > Process(1,2) ++ Process(3,4)
res0 : Process[Nothing, Int]
> Process.emitAll("sendai")
res1 : Process[Nothing, Char]
> res1.flatMap { in : Char =>
Process.emitAll('a' to in)
}
res2: Process[Nothing, Char]
> res2.map(_.shows)
res3: Process[Nothing, String]

10. More Basics
• scalaz-stream is pull based
• Process won’t produce a value until a Sink
requests one
• Being lazy allows us to create infinite Processes
• Process can be executed multiple times
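As an analogy for the pull model (plain Scala, not scalaz-stream itself), `LazyList` is also pull-based: an infinite description is fine because the consumer decides how much to evaluate, and the description can be evaluated repeatedly:

```scala
// Illustrative only; Process adds effects and resource safety on top
// of the lazy, pull-driven evaluation shown here.
object PullBased {
  // An infinite description: nothing is computed at definition time.
  val naturals: LazyList[Int] = LazyList.from(0)

  // Values are produced only when `take`/`toList` pull on the stream,
  // and the same description can be consumed again and again.
  def firstEvens(n: Int): List[Int] = naturals.map(_ * 2).take(n).toList
}
```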

11. Process
Sink

12. Iteration #1
generate
CSV from API

13. case class TradeReport(
datetime: DateTime,
contract: String,
lots: Int,
price: BigDecimal,
side: Side
)
sealed trait Side
case object Buy extends Side
case object Sell extends Side
Our Domain

.map(_.productIterator.mkString(","))
.prepend(Seq("datetime,contract,lots,price,direction"))
.intersperse("\n")
.pipe(text.utf8Encode)
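The pipeline above can be mirrored with plain collections to see what each combinator contributes; `Row` below is a simplified, hypothetical stand-in for the talk's TradeReport:

```scala
// Collection analogue of the Process pipeline: one comment per
// combinator it stands in for.
object CsvSketch {
  final case class Row(contract: String, lots: Int, price: BigDecimal)

  def toCsv(rows: Seq[Row]): String =
    (Seq("contract,lots,price") ++                   // .prepend(header)
      rows.map(_.productIterator.mkString(",")))     // .map(_.productIterator...)
      .mkString("\n")                                // .intersperse("\n")
}
```

The real pipeline then pipes the strings through `text.utf8Encode` to get bytes for a file Sink.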

type Offset = Option[Int]
case class Page(results: Seq[TradeReport], next: Offset)

def pagedRequest(
current: Offset
Pagination

16. def pagedRequest[A](
f: Offset => Task[Page[A]],
current: Offset = Some(0)
): Process[Task, A] =
Process
.eval(f(current))
.flatMap { response : Page[A] =>
Process.emitAll(response.results) ++
response.next.map { o =>
pagedRequest(f, Option(o))
}.getOrElse(Process.halt)
}
Pagination
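Stripping away the effect type, the recursion can be sketched in plain Scala (here the hypothetical `f` returns a Page directly rather than wrapping it in Task):

```scala
object PaginationSketch {
  type Offset = Option[Int]
  final case class Page[A](results: Seq[A], next: Offset)

  // Emit the current page's results, then recurse while a next offset exists.
  def pagedRequest[A](f: Offset => Page[A], current: Offset = Some(0)): Seq[A] = {
    val page = f(current)
    page.results ++ page.next.fold(Seq.empty[A])(o => pagedRequest(f, Some(o)))
  }
}
```

Feeding it a fake paged API recovers the property the talk later proves: "pagination is flattening".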

.map(_.productIterator.mkString(","))
.prepend(Seq("datetime,contract,lots,price,direction"))
.intersperse("\n")
.pipe(text.utf8Encode)

18. Iteration #2
refactor for purity

19. def pagedRequest[F[_], A](
f: Offset => F[Page[A]],
current: Offset
): Process[F, A]
Pure Pagination
def pagedRequest(
current: Offset

20. "pagination is flattening" >>
prop { (is: List[List[Int]]) =>

val f: (Offset) => Task[Page[Int]] = // ..

is.flatten must_== pagedRequest(f, Some(0)).runLog.run

}.setGen(Gen.nonEmptyListOf(Gen.listOf(Gen.posNum[Int])))
Pagination
[info] Finished in 588 ms
[info] 1 example, 100 expectations, 0 failure, 0 error

) : Process[F, String]

def writingTo(fileName: String)

val program =
writingTo(
DateTime.now)
}
Split Pure vs. IO
pagedRequest(fetchT..
.map(_.productIte..
.prepend(Seq("dat..
.intersperse("\n"..
.pipe(text.utf8En..
.to(io.fileChunkW..

22. Iteration #3
generate EOD report

23. What I haven’t told you
• Process is a deterministic sequence of actions
• scalaz-stream provides primitives for
non-deterministic operations
• merge operator to combine n Processes
• async.boundedQueue to fan out

24. > val q = async.boundedQueue[String](1)
q: sz.s.async.mutable.Queue[String]
> val p = Process("a", "b", "c")
.to(q.enqueue)
.onComplete(Process eval q.close)
> val r = q.dequeue to io.stdOutLines
Async bounded queue for fan-out
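As a rough JVM-level model of the bounded queue (not the library's implementation, which is non-blocking internally), a capacity-1 java.util.concurrent queue shows the back-pressure idea: the producer cannot get ahead of the consumer:

```scala
import java.util.concurrent.ArrayBlockingQueue

object QueueSketch {
  // Producer thread enqueues; the calling thread dequeues. With
  // capacity 1, `put` blocks until the consumer has taken the
  // previous element, so the producer is paced by the consumer.
  def fanThrough(items: Seq[String]): List[String] = {
    val q = new ArrayBlockingQueue[String](1)
    val producer = new Thread(() => items.foreach(q.put))
    producer.start()
    val out = items.indices.map(_ => q.take()).toList
    producer.join()
    out
  }
}
```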

25. > Task.gatherUnordered(
p.run :: r.run :: Nil)
> res0.run
a
b
c
Gathering the non-deterministic computations

26. val fetcher: Process[Task, Unit] =
.observe(q2.enqueue)
.to(q1.enqueue)
.onComplete(Process eval q1.close)
.onComplete(Process eval q2.close)
Fetching only once

q1.dequeue
.map(..
}
Consume 1/2
val program =
writingTo(
DateTime.now)
}
Consume the fetched trade reports

28. case class EndOfDayPosition(
contract: Contract,
position: Int
)
val endOfDaySummary = writingTo("summary.csv") {
q2.dequeue
.pipe(summarize)
.map(_.productIterator.mkString(","))
.prepend(Seq("contract,position"))
.intersperse("\n")
}
val summarize =
process1.fold(
// etc.
)
Consume 2/2
Generate the end-of-day summary

29. val summarize =
process1.fold(
Map
.empty[Contract, EndOfDayPosition]
.withDefault(EndOfDayPosition(_, 0))
) { (s, tr) =>
val eod = s(tr.contract)
s + (tr.contract -> eod.copy(position = tr.side match {
case Buy => eod.position + tr.lots
case Sell => eod.position - tr.lots
}))
}.flatMap(m => Process.emitAll(m.values.toSeq))
Compute EoD positions
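The same fold can be replayed with an ordinary foldLeft over a collection; `Trade` below is a simplified, hypothetical stand-in for TradeReport with a String contract:

```scala
object EodSketch {
  sealed trait Side
  case object Buy extends Side
  case object Sell extends Side
  final case class Trade(contract: String, lots: Int, side: Side)
  final case class EndOfDayPosition(contract: String, position: Int)

  // Accumulate the net position per contract (buys add, sells subtract),
  // then emit the map's values -- the collection analogue of
  // process1.fold(...).flatMap(m => Process.emitAll(m.values.toSeq)).
  def summarize(trades: Seq[Trade]): Seq[EndOfDayPosition] =
    trades.foldLeft(
      Map.empty[String, EndOfDayPosition].withDefault(EndOfDayPosition(_, 0))
    ) { (s, tr) =>
      val eod = s(tr.contract)
      s + (tr.contract -> eod.copy(position = tr.side match {
        case Buy  => eod.position + tr.lots
        case Sell => eod.position - tr.lots
      }))
    }.values.toSeq
}
```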

30. import shapeless.contrib.scalacheck._

"summarizing is groupby and sum" >> {
prop { (trs: List[TradeReport]) => (trs.size > 0) ==> {
trs
... must
containTheSameElementsAs(
Process
.emitAll(trs)
.pipe(demo.summarize)
.toList)
}}
Testing Proving it ..
[info] Finished in 671 ms
[info] 1 example, 100 expectations, 0 failure, 0 error

31. Task.gatherUnordered(
fetcher.run ::
endOfDaySummary.run :: Nil
)
The Finale

32. In 30 minutes from now
• You will know
• why you would want to use scalaz-stream
• the basic building blocks of scalaz-stream
• how we iteratively built a small report-generation program
Now you too should want to use scalaz-stream

33. One more thing
• Before: scalaz-stream
• Soon: Functional Streams for Scala / fs2
• library with 0 dependencies
• Process[F, O] becomes Stream[F, W]

34. Q&A / Thank you!

@suls
github.com/suls