Keep your sanity with compositional tracing

Keep your sanity with compositional tracing Photo by Esther Wilhelmsson
on Unsplash Jakub Kozłowski |> Typelevel Summit |> New York City |> 12.03.2020 speakerdeck.com/kubukoz

Standard ways of inspecting live systems Inspecting distributed systems Implementation
hackery! In this talk

type Result = Int val processPayment: String => Result =
{ ... } val result = processPayment(paymentId) Scenario: process payment with an ID

logger.info(s"Started processing payment $paymentId") val result = processPayment(paymentId) logger.info(s"Finished processing
payment with result $result") <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException</pattern> Let's log things!

payment with result $result") 2020-03-07 19:21:39.817 [INFO] Started processing payment 3 2020-03-07 19:21:39.817 [INFO] Started processing payment 2 2020-03-07 19:21:39.817 [INFO] Started processing payment 1 2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 10 2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 85 2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 46 <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException</pattern> Let's log things!

payment with result $result") 2020-03-07 19:21:39.817 [INFO] Started processing payment 3 2020-03-07 19:21:39.817 [INFO] Started processing payment 2 2020-03-07 19:21:39.817 [INFO] Started processing payment 1 2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 10 2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 85 2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 46 Which result came from which paymentId? <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException</pattern> Let's log things!

<pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException</pattern> <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%thread] %message%n%xException</pattern> logger.info(s"Started
processing payment $paymentId") val result = processPayment(paymentId) logger.info(s"Finished processing payment with result $result") Let's add the executing thread!

<pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException</pattern> <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%thread] %message%n%xException</pattern> logger.info(s"Started
processing payment $paymentId") val result = processPayment(paymentId) logger.info(s"Finished processing payment with result $result") 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-17] Started processing payment 2 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-19] Started processing payment 3 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment with result 81 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment with result 52 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment with result 60 Let's add the executing thread!

2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-17] Started processing payment 2 2020-03-07 19:26:38.490
[INFO] [scala-execution-context-global-19] Started processing payment 3 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment with result 81 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment with result 52 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment with result 60 ... ... ... 2020-03-07 19:27:38.493 [INFO] [scala-execution-context-global-19] Oh, one more thing Is this still the same request? When is the request over?

[INFO] [scala-execution-context-global-19] Started processing payment 3 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment 3 with result 81 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment 2 with result 52 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment 1 with result 60 So... logging payment ID in each message?

[INFO] [scala-execution-context-global-19] Started processing payment 3 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment 3 with result 81 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment 2 with result 52 2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment 1 with result 60 So... logging payment ID in each message? Have fun grepping!

package org.slf4j //rough Scala translation object MDC { def put(key:
String, value: String): Unit def putCloseable(key: String, value: String): Closeable def remove(key: String): Unit def getCopyOfContextMap(): Map[String, String] def clear(): Unit //... } Mapped Diagnostic Context (MDC)

val processPayment: String => Result = { ... } val
requestId = UUID.randomUUID().toString() Using.resource(MDC.putCloseable("RequestId", requestId)) { _ => logger.info(s"Started processing payment $paymentId") val result = processPayment(paymentId) logger.info(s"Finished processing payment with result $result") } <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%mdc{RequestId:--}] %message%n%xException</pattern> Logging with MDC

requestId = UUID.randomUUID().toString() Using.resource(MDC.putCloseable("RequestId", requestId)) { _ => logger.info(s"Started processing payment $paymentId") val result = processPayment(paymentId) logger.info(s"Finished processing payment with result $result") } <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%mdc{RequestId:--}] %message%n%xException</pattern> 2020-03-07 20:24:26.236 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Started processing payment 3 2020-03-07 20:24:26.236 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Started processing payment 1 2020-03-07 20:24:26.236 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Started processing payment 2 2020-03-07 20:24:26.238 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Finished processing payment with result 25 2020-03-07 20:24:26.238 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Finished processing payment with result 76 2020-03-07 20:24:26.238 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Finished processing payment with result 88 Logging with MDC

requestId = UUID.randomUUID().toString() Using.resource(MDC.putCloseable("RequestId", requestId)) { _ => logger.info(s"Started processing payment $paymentId") val result = processPayment(paymentId) logger.info(s"Finished processing payment with result $result") } <pattern>%date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%mdc{RequestId:--}] %message%n%xException</pattern> 2020-03-07 20:24:26.236 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Started processing payment 3 2020-03-07 20:24:26.236 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Started processing payment 1 2020-03-07 20:24:26.236 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Started processing payment 2 2020-03-07 20:24:26.238 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Finished processing payment with result 25 2020-03-07 20:24:26.238 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Finished processing payment with result 76 2020-03-07 20:24:26.238 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Finished processing payment with result 88 last log with request id - end of request Logging with MDC

val processPayment: String => Future[Result] = { ... } Using.resource(MDC.putCloseable("RequestId",
requestId)) { _ => logger.info(s"Started processing payment $paymentId") processPayment(paymentId).map { result => logger.info(s"Finished processing payment with result $result") } } What if the thread changes?

val processPayment: String => Future[Result] = { ... } Using.resource(MDC.putCloseable("RequestId",
requestId)) { _ => logger.info(s"Started processing payment $paymentId") processPayment(paymentId).map { result => logger.info(s"Finished processing payment with result $result") } } 2020-03-07 20:44:10.781 [INFO] [96a4e310-1f2d-4670-80c8-eb28c0561e4a] Started processing payment 2 2020-03-07 20:44:10.781 [INFO] [1afb3aa6-c7ec-4c56-b9a8-c5ccf2c13cf3] Started processing payment 1 2020-03-07 20:44:10.781 [INFO] [d857fe1d-48cf-4f58-b369-8705b31f9b8d] Started processing payment 3 2020-03-07 20:44:10.790 [INFO] [-] Finished processing payment with result 46 2020-03-07 20:44:10.790 [INFO] [-] Finished processing payment with result 89 2020-03-07 20:44:10.790 [INFO] [-] Finished processing payment with result 73 What happened? What if the thread changes?

Async programming on the JVM doThing { result => doOtherThing(result)
{ secondResult => doMoreThings(result, secondResult) } }

Async programming in Scala doThing.flatMap { result => doOtherThing(result).flatMap {
secondResult => doMoreThings(result, secondResult) } }

Async programming in Scala for { result <- doThing secondResult
<- doOtherThing(result) out <- doMoreThings(result, secondResult) } yield out

What happens in map / flatMap ? Future .successful(42) .map(f)
.map(g)

What happens in map / flatMap ? Future .successful(42) .map(f)(ec1)
.map(g)(ec2)

What happens in map / flatMap ? Future .successful(42) .map(f)(ec1)
.map(g)(ec2) ec1.execute { () => val x = f(42) ec2.execute { val y = g(x) ... } }

Where is context lost? ec1.execute { () => val x
= f(42) ... }

Propagating context between submission to EC and execution def runWithContext[A](ctx:
Map[String, String])(f: => A): A = { val oldContext = getContext() setContext(ctx) try f finally setContext(oldContext) }

Propagating context between submission to EC and execution // Simplified
definitions type Runnable = () => Unit type ExecutionContext = Runnable => Unit

definitions type Runnable = () => Unit type ExecutionContext = Runnable => Unit object mdcWithContext { def getContext(): Map[String, String] = MDC.getCopyOfContextMap().asScala.toMap def setContext(ctx: Map[String, String]): Unit = MDC.setContextMap(ctx.asJava) }

definitions type Runnable = () => Unit type ExecutionContext = Runnable => Unit object mdcWithContext { def getContext(): Map[String, String] = MDC.getCopyOfContextMap().asScala.toMap def setContext(ctx: Map[String, String]): Unit = MDC.setContextMap(ctx.asJava) } def mdcAware(ec: ExecutionContext): ExecutionContext = runnable => { val rootContext = mdcWithContext.getContext() ec(() => mdcWithContext.runWithContext(rootContext)(runnable())) }

Mapping on an instrumented ExecutionContext val rootContext = getContext() ec1.execute
{ () => val storedContext1 = getContext() setContext(rootContext) try { val x = f(42) //... } finally { setContext(storedContext1) } } ec1.execute { () => val x = f(42) ... }

Mapping on an instrumented ExecutionContext ec1.execute { () => val
x = f(42) ... } val rootContext = getContext() ec1.execute { () => val storedContext1 = getContext() setContext(rootContext) try { val x = f(42) //... } finally { setContext(storedContext1) } }

Mapping on an instrumented ExecutionContext val rootContext = getContext() ec1.execute
{ () => val storedContext1 = getContext() setContext(rootContext) try { val x = f(42) //... } finally { setContext(storedContext1) } }

implicit val ec = mdcAware(ExecutionContext.global) Keeping MDC on task submission
to ExecutionContext

implicit val ec = mdcAware(ExecutionContext.global) 2020-03-07 20:47:23.779 [INFO] [85c52920-4271-40f0-a150-d34bf4f8b5ec] Started
processing payment 2 2020-03-07 20:47:23.779 [INFO] [78e9ccbc-98cb-440a-848c-d72f917857dd] Started processing payment 1 2020-03-07 20:47:23.779 [INFO] [d57bf741-51fc-45e6-a6ca-58f6488c9d05] Started processing payment 3 2020-03-07 20:47:23.785 [INFO] [d57bf741-51fc-45e6-a6ca-58f6488c9d05] Finished processing payment with result 96 2020-03-07 20:47:23.787 [INFO] [78e9ccbc-98cb-440a-848c-d72f917857dd] Finished processing payment with result 77 2020-03-07 20:47:23.785 [INFO] [85c52920-4271-40f0-a150-d34bf4f8b5ec] Finished processing payment with result 55 Keeping MDC on task submission to ExecutionContext

Sure! implicit val contextShift: ContextShift[IO] = IO.contextShift(ec) Does this work
with IO?

val processPayment: String => IO[Result] = { ... } def
executeRequest(paymentId: String) = { val requestId = ju.UUID.randomUUID().toString() Using.resource(MDC.putCloseable("RequestId", requestId)) { _ => logger.info(s"Started processing payment $paymentId") processPayment(paymentId).map { result => logger.info(s"Finished processing payment with result $result") } } } Let's try it

executeRequest(paymentId: String) = { val requestId = ju.UUID.randomUUID().toString() Using.resource(MDC.putCloseable("RequestId", requestId)) { _ => logger.info(s"Started processing payment $paymentId") processPayment(paymentId).map { result => logger.info(s"Finished processing payment with result $result") } } } 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 2 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 4 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 3 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 1 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 52 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 19 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 93 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 97 Let's try it

executeRequest(paymentId: String) = { val requestId = ju.UUID.randomUUID().toString() Using.resource(MDC.putCloseable("RequestId", requestId)) { _ => logger.info(s"Started processing payment $paymentId") processPayment(paymentId).map { result => logger.info(s"Finished processing payment with result $result") } } } 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 2 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 4 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 3 2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 1 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 52 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 19 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 93 2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 97 Let's try it Lazy evaluation!

Little lesson in hackery def getContext(): Map[String, String] def setContext(ctx:
Map[String, String]): Unit def runWithContext[A]( ctx: Map[String, String] )(f: => A): A = { val rootContext = getContext() setContext(ctx) try f finally setContext(rootContext) }

val getContext: IO[Map[String, String]] def setContext(ctx: Map[String, String]): IO[Unit] def
withContext[A]( ctx: Map[String, String] )(f: IO[A]): IO[A] = getContext.flatMap { rootContext => setContext(ctx) .bracket(_ => f)( _ => setContext(rootContext) ) } Impure Little lesson in hackery def getContext(): Map[String, String] def setContext(ctx: Map[String, String]): Unit def runWithContext[A]( ctx: Map[String, String] )(f: => A): A = { val rootContext = getContext() setContext(ctx) try f finally setContext(rootContext) } Pure

val getContext: IO[Map[String, String]] = IO(MDC.getCopyOfContextMap().asScala.toMap) def setContext(ctx: Map[String, String]):
IO[Unit] = ???

val getContext: IO[Map[String, String]] = IO(MDC.getCopyOfContextMap().asScala.toMap) def setContext(ctx: Map[String, String]):
IO[Unit] = IO(MDC.setContextMap(ctx.asJava)) That'll work... for MDC Hack #1 - just set the context

Why does this work? withContext(newContext)( fa = launchMissiles ) def
setContext(ctx: Map[String, String]): IO[Unit] = IO(MDC.setContextMap(ctx.asJava))

Why does this work? withContext(newContext)( fa = IO.shift(mdcAware(ec)) *> launchMissiles
) def setContext(ctx: Map[String, String]): IO[Unit] = IO(MDC.setContextMap(ctx.asJava))

def setContext(ctx: Map[String, String]): IO[Unit] = IO.async[Unit] { cb =>
runWithContext(ctx) { cb(Right(())) } } Hack #2 - use runWithContext and async

def setContext(ctx: Map[String, String]): IO[Unit] = IO.async[Unit] { cb =>
runWithContext(ctx) { cb(Right(())) } } This will work* for APIs that don't directly provide setContext Hack #2 - use runWithContext and async * - with one more change we'll see in a minute

Why does THAT work? - IO.async 101 object IO {
def async[A](k: (Either[Throwable, A] => Unit) => Unit): IO[A] } How do Fibers Work? A Peek Under the Hood https://youtu.be/x5_MmZVLiSM

def async[A](k: (Either[Throwable, A] => Unit) => Unit): IO[A] } cb How do Fibers Work? A Peek Under the Hood https://youtu.be/x5_MmZVLiSM

def async[A](k: (Either[Throwable, A] => Unit) => Unit): IO[A] } cb val sleep5s = IO.async[Int] { cb: (Either[Throwable, Int] => Unit) => scheduler.schedule(cb(Right(42)), 5.seconds) } val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i))) How do Fibers Work? A Peek Under the Hood https://youtu.be/x5_MmZVLiSM

Why does THAT work? - IO.async 101 val sleep5s =
IO.async[Int] { cb: (Either[Throwable, Int] => Unit) => scheduler.schedule(cb(Right(42)), 5.seconds) } val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i))) Mental model How do Fibers Work? A Peek Under the Hood https://youtu.be/x5_MmZVLiSM

IO.async[Int] { cb: (Either[Throwable, Int] => Unit) => scheduler.schedule(cb(Right(42)), 5.seconds) } val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i))) val cb = { case Right(i: Int) => sendMissiles(count = i) case Left(e) => throw e } Mental model How do Fibers Work? A Peek Under the Hood https://youtu.be/x5_MmZVLiSM

IO.async[Int] { cb: (Either[Throwable, Int] => Unit) => scheduler.schedule(cb(Right(42)), 5.seconds) } val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i))) val cb = { case Right(i: Int) => sendMissiles(count = i) case Left(e) => throw e } Mental model val prog = scheduler.schedule(sendMissiles(42), ...) => How do Fibers Work? A Peek Under the Hood https://youtu.be/x5_MmZVLiSM

Why does this work? withContext(newContext)( fa = launchMissiles ) *>
remainder def setContext(ctx: Map[String, String]): IO[Unit] = IO.async[Unit] { cb => runWithContext(ctx) { cb(Right(())) } }

Why does this work? withContext(newContext)( fa = IO.shift(mdcAware(ec)) *> launchMissiles
) *> remainder def setContext(ctx: Map[String, String]): IO[Unit] = IO.async[Unit] { cb => runWithContext(ctx) { cb(Right(())) } }

But wait, there's more! def setContext(ctx: Map[String, String]): IO[Unit] =
IO.shift *> IO.async[Unit] { cb => runWithContext(ctx) { cb(Right(())) } }

(Cats) IO trampolines callbacks for stack safety def showAsync(tag: String):
IO[Unit] = IO.async[Unit] { cb => println(s"before $tag") cb(Right(())) println(s"after $tag") } showAsync("cb1") *> showAsync("cb2") *> IO(println("foo"))

(Cats) IO trampolines callbacks for stack safety def showAsync(tag: String):
IO[Unit] = IO.async[Unit] { cb => println(s"before $tag") cb(Right(())) println(s"after $tag") } showAsync("cb1") *> showAsync("cb2") *> IO(println("foo")) before cb1 before cb2 after cb2 foo after cb1

Shifting before async makes sure running cb evaluates the continuation
immediately def showAsync(tag: String): IO[Unit] = IO.shift *> IO.async[Unit] { cb => println(s"before $tag") cb(Right(())) println(s"after $tag") } showAsync("cb1") *> showAsync("cb2") *> IO(println("foo")) before cb1 after cb1 before cb2 foo after cb2

Which one should I use? The idea is still pretty
fresh, but right now:

Which one should I use? The idea is still pretty
fresh, but right now: val getContext: IO[Map[String, String]] = IO(MDC.getCopyOfContextMap().asScala.toMap) def setContext(ctx: Map[String, String]): IO[Unit] = IO(MDC.setContextMap(ctx.asJava)) def withContext[A]( ctx: Map[String, String] )(f: IO[A]): IO[A] = getContext.flatMap { rootContext => setContext(ctx) .bracket(_ => f)( _ => setContext(rootContext) ) } Hack #1 (for simplicity)

Are we done? Let's wait till 3:00 am

Who called this application? Are we done? Let's wait till
3:00 am

Who called this application? What applications did it call? Which
URLs? Are we done? Let's wait till 3:00 am

URLs? What were the response times? Are we done? Let's wait till 3:00 am

URLs? What were the response times? How long did the whole call take? Are we done? Let's wait till 3:00 am

URLs? What were the response times? How long did the whole call take? Were there any (caught) errors? Are we done? Let's wait till 3:00 am

Logs are linear 19:43:59.023 [INFO] Started processing payment 1 19:43:59.026
[INFO] Calling database 19:43:59.137 [INFO] Got result from database 19:43:59.154 [INFO] Calling remote service 19:43:59.155 [INFO] Reading file 19:43:59.208 [INFO] Called remote service 19:43:59.360 [INFO] Read file def executeRequest(paymentId: String) = for { _ <- logger.info(s"Started processing payment $paymentId") _ <- logger.info("Calling database") _ <- callDatabase _ <- logger.info("Got result from database") result <- ( logger.info("Calling remote service") *> callRemoteService <* logger.info("Called remote service"), logger.info("Reading file") *> readFile <* logger.info("Read file") ).parMapN(combineResults) } yield result time Have fun subtracting timestamps!

What we need def executeRequest(paymentId: String) = span("process") { for
{ _ <- logger.info(s"Processing payment $paymentId") _ <- span("db")(callDatabase) result <- ( span("remote")(callRemoteService), span("file")(readFile) ).parMapN(combineResults) } yield result }

What we need def executeRequest(paymentId: String) = span("process") { for
{ _ <- logger.info(s"Processing payment $paymentId") _ <- span("db")(callDatabase) result <- ( span("remote")(callRemoteService), span("file")(readFile) ).parMapN(combineResults) } yield result } - immediately see order of execution - concurrency becomes obvious - inspect details on demand (expand span)

"...a form of structured logging that assumes duration and hierarchy"
Source: https://tersesystems.com/blog/2020/03/10/a-taxonomy-of-logging/ Distributed tracing

Source: https://tersesystems.com/blog/2020/03/10/a-taxonomy-of-logging/ Distributed tracing Trace

Source: https://tersesystems.com/blog/2020/03/10/a-taxonomy-of-logging/ Distributed tracing Trace Span

Distributed tracing solutions

Case study - log before - async call to database
(simulated by sleep) - async call to remote service - log afterwards new BusinessLogic[F] { def execute(args: Args): F[Result] = for { _ <- Logger[F].info(show"Executing request $args") _ <- Timer[F].sleep(100.millis) _ <- client.successful(POST(uri"http://localhost:8080/execute")) _ <- Logger[F].info(show"Executed request $args") } yield Result(show"${args.message} finished") } Reference implementation (no tracing) Check out the source code for details (repo linked in ﬁnal slide)

Create our own model of a span for simplicity Keep
track via parameter passing Provide to logger/client which will set info as MDC Convert to zipkin model and report manually after each span Reporting traces - direct, no-magic approach

Simple span model final case class Span( name: String, traceId:
String, spanId: String, parentSpanId: Option[String], values: Map[String, String] )

Reporting traces - direct, no-magic approach def databaseCall(rootSpan: Span) =
newSpan( "db-call", rootSpan, Map("db.query" -> "select * from users where id = ?") ) { span => logger.info(span.toMap)("Running db call") *> IO.sleep(100.millis) }

newSpan( "db-call", rootSpan, Map("db.query" -> "select * from users where id = ?") ) { span => logger.info(span.toMap)("Running db call") *> IO.sleep(100.millis) } def clientCall(rootSpan: Span) = newSpan("remote-call", rootSpan) { child => client.successful( POST( uri"http://localhost:8080/execute", child.toTraceHeaders.toList: _* ) ) }

newSpan( "db-call", rootSpan, Map("db.query" -> "select * from users where id = ?") ) { span => logger.info(span.toMap)("Running db call") *> IO.sleep(100.millis) } def clientCall(rootSpan: Span) = newSpan("remote-call", rootSpan) { child => client.successful( POST( uri"http://localhost:8080/execute", child.toTraceHeaders.toList: _* ) ) } def execute(args: Args, span: Span): IO[Result] = for { _ <- logger.info(span.toMap)(show"Executing request $args") _ <- databaseCall(span) _ <- clientCall(span) _ <- logger.info(span.toMap)(show"Executed request $args") } yield Result(show"${args.message} finished")

What we'll get in UI (applies to all our approaches
here)

Distributed tracing - Kamon, same threadlocal magic as MDC val
databaseCall = childSpan( "db-call", Map("db.query" -> "select * from users where id = ?") ) { logger.info("Running db call") *> IO.sleep(100.millis) } val clientCall = childSpan("remote-call") { //grab context before, set it again afterwards Tracing[IO].keepSpanAround { client.successful(POST(uri"http://localhost:8080/execute")) } } def execute(args: Args): IO[Result] = for { _ <- logger.info(show"Executing request $args") _ <- databaseCall _ <- clientCall _ <- logger.info(show"Executed request $args") } yield Result(show"${args.message} finished")

Comparison Direct tracing (parameter passing) - Lots of boilerplate -
Viral (inﬂuences APIs on all levels) - Extra work to integrate with MDC - Reporting done manually + Zero magic + No loss of context Kamon tracing (instrumentation) - Lots of magic (AOP, thread locals) - Complex and fragile instrumentation for async IO - Each interface for async needs special instrumentation + No changes in APIs + No boilerplate + Lots of integrations + Handles reporting for you + Context passed to MDC

trait BusinessLogic[F[_]] { def execute(args: Args, span: Span): F[Result] }
trait AnotherProgram[F[_]] { def summary(result: Result, span: Span): F[Unit] } trait BusinessLogic[F[_]] { def execute(args: Args): F[Result] } trait AnotherProgram[F[_]] { def summary(result: Result): F[Unit] } => Who's the winner? Both solutions have shortcomings. Ideally... No boilerplate No magic No virality Lots of integrations MDC support Automatic reporting

@finalAlg trait BusinessLogic[F[_]] { def execute(args: Args): F[Result] } @finalAlg
trait AnotherProgram[F[_]] { def summary(result: Result): F[Unit] } def program[F[_]: FlatMap: BusinessLogic: AnotherProgram: Trace](args: Args): F[Unit] = for { result <- Trace[F].span("execute")(BusinessLogic[F].execute(args)) _ <- Trace[F].span("summary")(AnotherProgram[F].summary(result)) } yield () type M[A] = ??? def execute(args: Args): M[A] = program[M](args) What would it look like?

What's the F? def program[F[_]: FlatMap: BusinessLogic: AnotherProgram: Trace](args: Args):
F[Unit] = for { result <- Trace[F].span("execute")(BusinessLogic[F].execute(args)) _ <- Trace[F].span("summary")(AnotherProgram[F].summary(result)) } yield () def execute(args: Args) = program[?????](args)

final case class MyMagicMonadForContext[F[_], Context, A](impl: Context => F[A]) Let's
make a new monad

final case class MyMagicMonadForContext[F[_], Context, A](impl: Context => F[A]) {
def flatMap[B]( f: A => MyMagicMonadForContext[F, Context, B] )( implicit F: Monad[F] ): MyMagicMonadForContext[F, Context, B] = MyMagicMonadForContext { ctx => impl(ctx).flatMap(f(_).impl(ctx)) } def local(f: Context => Context): MyMagicMonadForContext[F, Context, A] = MyMagicMonadForContext(f andThen impl) def scope(ctx: Context): MyMagicMonadForContext[F, Context, A] = local(_ => ctx) } Let's make a new monad

We made a new monad!

type AuthenticatedT [F[_], A] = MyMagicMonadForContext[F, User, A] type Transactional
[F[_], A] = MyMagicMonadForContext[F, Connection, A] type Traced [F[_], A] = MyMagicMonadForContext[F, Span, A] type WithTenant [F[_], A] = MyMagicMonadForContext[F, ClientId, A] We made a new monad!

[F[_], A] = MyMagicMonadForContext[F, Connection, A] type Traced [F[_], A] = MyMagicMonadForContext[F, Span, A] type WithTenant [F[_], A] = MyMagicMonadForContext[F, ClientId, A] type ZIO[-R, +E, +A] = MyMagicMonadForContext[EitherT[IO, E, *], R, A] semantically ;) We made a new monad!

[F[_], A] = MyMagicMonadForContext[F, Connection, A] type Traced [F[_], A] = MyMagicMonadForContext[F, Span, A] type WithTenant [F[_], A] = MyMagicMonadForContext[F, ClientId, A] type ZIO[-R, +E, +A] = MyMagicMonadForContext[EitherT[IO, E, *], R, A] semantically ;) https://youtu.be/7xSfLPD6tiQ?t=707 cats We made a new monad!

Hint: Did we make a new monad?

final case class Kleisli[F[_], -A, B](run: A => F[B]) Hint:
Did we make a new monad?

final case class Kleisli[F[_], -A, B](run: A => F[B]) Hint:
type ReaderT[F[_], -A, B] = Kleisli[F, A, B] val ReaderT = Kleisli Did we make a new monad?

Why is ReaderT the perfect ﬁt - automatic passing of
context - get current context for free with Kleisli.ask: Kleisli[F, A, A] - change context in a region easily with local

However... - we still need integrations (wrapping clients in ones
that take headers from context in Kleisli) - we need an escape hatch for when we really need MDC - we need a cool library that provides a nice API on top of Kleisli...

Natchez API trait Span[F[_]] { def put(fields: (String, TraceValue)*): F[Unit]
def kernel: F[Kernel] def span(name: String): Resource[F, Span[F]] } trait Trace[F[_]] { def put(fields: (String, TraceValue)*): F[Unit] def kernel: F[Kernel] def span[A](name: String)(k: F[A]): F[A] } object Trace { implicit def kleisliInstance[F[_]: Bracket[*[_], Throwable]]: Trace[Kleisli[F, Span[F], *]] = ... }

Quick teaser of usage def execute(args: Args): F[Result] = for
{ _ <- Logger[F].info(show"Executing request $args") _ <- Trace[F].span("db")(Timer[F].sleep(100.millis)) _ <- client.successful(POST(uri"http://localhost:8080/execute" _ <- Logger[F].info(show"Executed request $args") } yield Result(show"${args.message} finished")

Quick teaser of usage type Traced[A] = Kleisli[IO, Span[IO], A]
val businessLogic = BusinessLogic.instance[Traced] def exec(msg: String): IO[Result] = Zipkin .entryPoint[IO]("client", blocker) .use { _.root(msg).use { businessLogic.execute(Args(msg)).run } } def execute(args: Args): F[Result] = for { _ <- Logger[F].info(show"Executing request $args") _ <- Trace[F].span("db")(Timer[F].sleep(100.millis)) _ <- client.successful(POST(uri"http://localhost:8080/execute" _ <- Logger[F].info(show"Executed request $args") } yield Result(show"${args.message} finished")

val businessLogic = BusinessLogic.instance[Traced] def exec(msg: String): IO[Result] = Zipkin .entryPoint[IO]("client", blocker) .use { _.root(msg).use { businessLogic.execute(Args(msg)).run } } def execute(args: Args): F[Result] = for { _ <- Logger[F].info(show"Executing request $args") _ <- Trace[F].span("db")(Timer[F].sleep(100.millis)) _ <- client.successful(POST(uri"http://localhost:8080/execute" _ <- Logger[F].info(show"Executed request $args") } yield Result(show"${args.message} finished") def kleisliTracedClient[F[_]: Bracket[*[_], Throwable]: Defer]( underlying: Client[F] ): Client[Kleisli[F, Span[F], *]] = ...

val businessLogic = BusinessLogic.instance[Traced] def exec(msg: String): IO[Result] = businessLogic.execute(Args(msg)).run def execute(args: Args): F[Result] = for { _ <- Logger[F].info(show"Executing request $args") _ <- Trace[F].span("db")(Timer[F].sleep(100.millis)) _ <- client.successful(POST(uri"http://localhost:8080/execute" _ <- Logger[F].info(show"Executed request $args") } yield Result(show"${args.message} finished") def kleisliTracedClient[F[_]: Bracket[*[_], Throwable]: Defer]( underlying: Client[F] ): Client[Kleisli[F, Span[F], *]] = ... Zipkin .entryPoint[IO]("client", blocker) .use { _.root(msg).use { } } This should eventually be delegated to e.g. http4s server wrapper

Integrations? As of now Soon In the meantime... Using Kamon
for reporting?

Does it work? Open the examples of this talk and
see for yourself (including temporary custom zipkin integration)

Thank you blog.kubukoz.com @kubukoz Slides: speakerdeck.com/kubukoz Code: git.io/Jv6mN Find me
on YouTube! (https://youtube.com/channel/UCBSRCuGz9laxVv0rAnn2O9Q)

Keep your sanity with compositional tracing

Keep your sanity with compositional tracing

More Decks by Jakub Kozłowski

Other Decks in Programming

Featured

Transcript