Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Keep your sanity with compositional tracing

Keep your sanity with compositional tracing

Logging is not enough for building distributed systems that you can investigate in case of production issues (which will happen). As an alternative, we can use tracing.

We’ll learn why logging is not the adequate tool for finding issues in distributed systems and how tracing solves its problems.

Later, we’ll see how it can be implemented, starting from the simplest possible solution, progressing through a clean, compositional solution that doesn’t clutter the API with details of tracing, and works with HTTP requests as well as asynchronous message handling.

Jakub Kozłowski

March 12, 2020
Tweet

More Decks by Jakub Kozłowski

Other Decks in Programming

Transcript

  1. Keep your sanity
    with compositional tracing
    Photo by Esther Wilhelmsson on Unsplash
    Jakub Kozłowski |>
    Typelevel Summit |>
    New York City |>
    12.03.2020
    speakerdeck.com/kubukoz

    View Slide

  2. Standard ways of inspecting live systems

    Inspecting distributed systems

    Implementation hackery!
    In this talk

    View Slide

  3. type Result = Int
    val processPayment: String => Result = { ... }
    val result = processPayment(paymentId)
    Scenario: process payment with an ID

    View Slide

  4. logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException
    Let's log things!

    View Slide

  5. logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    2020-03-07 19:21:39.817 [INFO] Started processing payment 3
    2020-03-07 19:21:39.817 [INFO] Started processing payment 2
    2020-03-07 19:21:39.817 [INFO] Started processing payment 1
    2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 10
    2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 85
    2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 46
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException
    Let's log things!

    View Slide

  6. logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    2020-03-07 19:21:39.817 [INFO] Started processing payment 3
    2020-03-07 19:21:39.817 [INFO] Started processing payment 2
    2020-03-07 19:21:39.817 [INFO] Started processing payment 1
    2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 10
    2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 85
    2020-03-07 19:21:39.820 [INFO] Finished processing payment with result 46
    Which result came from which paymentId?
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException
    Let's log things!

    View Slide

  7. %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%thread] %message%n%xException
    logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    Let's add the executing thread!

    View Slide

  8. %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%thread] %message%n%xException
    logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-17] Started processing payment 2
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-19] Started processing payment 3
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment with result 81
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment with result 52
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment with result 60
    Let's add the executing thread!

    View Slide

  9. %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] %message%n%xException
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%thread] %message%n%xException
    logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-17] Started processing payment 2
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-19] Started processing payment 3
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment with result 81
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment with result 52
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment with result 60
    Let's add the executing thread!

    View Slide

  10. View Slide

  11. View Slide

  12. 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-17] Started processing payment 2
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-19] Started processing payment 3
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment with result 81
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment with result 52
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment with result 60
    ...
    ...
    ...
    2020-03-07 19:27:38.493 [INFO] [scala-execution-context-global-19] Oh, one more thing
    Is this still the same request?
    When is the request over?

    View Slide

  13. 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-17] Started processing payment 2
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-19] Started processing payment 3
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment 3 with result 81
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment 2 with result 52
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment 1 with result 60
    So... logging payment ID in each message?

    View Slide

  14. 2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-17] Started processing payment 2
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-19] Started processing payment 3
    2020-03-07 19:26:38.490 [INFO] [scala-execution-context-global-15] Started processing payment 1
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-15] Finished processing payment 3 with result 81
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-19] Finished processing payment 2 with result 52
    2020-03-07 19:26:38.493 [INFO] [scala-execution-context-global-17] Finished processing payment 1 with result 60
    So... logging payment ID in each message?
    Have fun grepping!

    View Slide

  15. package org.slf4j
    //rough Scala translation
    object MDC {
    def put(key: String, value: String): Unit
    def putCloseable(key: String, value: String): Closeable
    def remove(key: String): Unit
    def getCopyOfContextMap(): Map[String, String]
    def clear(): Unit
    //...
    }
    Mapped Diagnostic Context (MDC)

    View Slide

  16. val processPayment: String => Result = { ... }
    val requestId = UUID.randomUUID().toString()
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    }
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%mdc{RequestId:--}] %message%n%xException
    Logging with MDC

    View Slide

  17. val processPayment: String => Result = { ... }
    val requestId = UUID.randomUUID().toString()
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    }
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%mdc{RequestId:--}] %message%n%xException
    2020-03-07 20:24:26.236 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Started processing payment 3
    2020-03-07 20:24:26.236 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Started processing payment 1
    2020-03-07 20:24:26.236 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Started processing payment 2
    2020-03-07 20:24:26.238 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Finished processing payment with result 25
    2020-03-07 20:24:26.238 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Finished processing payment with result 76
    2020-03-07 20:24:26.238 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Finished processing payment with result 88
    Logging with MDC

    View Slide

  18. val processPayment: String => Result = { ... }
    val requestId = UUID.randomUUID().toString()
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    val result = processPayment(paymentId)
    logger.info(s"Finished processing payment with result $result")
    }
    %date{YYYY-MM-dd HH:mm:ss.SSS} [%level] [%mdc{RequestId:--}] %message%n%xException
    2020-03-07 20:24:26.236 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Started processing payment 3
    2020-03-07 20:24:26.236 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Started processing payment 1
    2020-03-07 20:24:26.236 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Started processing payment 2
    2020-03-07 20:24:26.238 [INFO] [29e707e5-5186-4e31-9d6b-9f73e96d4198] Finished processing payment with result 25
    2020-03-07 20:24:26.238 [INFO] [c2db42d8-8a1b-4e9d-b826-bf7fd680058a] Finished processing payment with result 76
    2020-03-07 20:24:26.238 [INFO] [9a7ef5de-eac2-413a-903b-8516ff956f8c] Finished processing payment with result 88
    last log with request id - end of request
    Logging with MDC

    View Slide

  19. val processPayment: String => Future[Result] = { ... }
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    processPayment(paymentId).map { result =>
    logger.info(s"Finished processing payment with result $result")
    }
    }
    What if the thread changes?

    View Slide

  20. val processPayment: String => Future[Result] = { ... }
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    processPayment(paymentId).map { result =>
    logger.info(s"Finished processing payment with result $result")
    }
    }
    2020-03-07 20:44:10.781 [INFO] [96a4e310-1f2d-4670-80c8-eb28c0561e4a] Started processing payment 2
    2020-03-07 20:44:10.781 [INFO] [1afb3aa6-c7ec-4c56-b9a8-c5ccf2c13cf3] Started processing payment 1
    2020-03-07 20:44:10.781 [INFO] [d857fe1d-48cf-4f58-b369-8705b31f9b8d] Started processing payment 3
    2020-03-07 20:44:10.790 [INFO] [-] Finished processing payment with result 46
    2020-03-07 20:44:10.790 [INFO] [-] Finished processing payment with result 89
    2020-03-07 20:44:10.790 [INFO] [-] Finished processing payment with result 73

    What happened?
    What if the thread changes?

    View Slide

  21. Async programming on the JVM
    doThing { result =>
    doOtherThing(result) { secondResult =>
    doMoreThings(result, secondResult)
    }
    }

    View Slide

  22. Async programming in Scala
    doThing.flatMap { result =>
    doOtherThing(result).flatMap { secondResult =>
    doMoreThings(result, secondResult)
    }
    }

    View Slide

  23. Async programming in Scala
    for {
    result <- doThing
    secondResult <- doOtherThing(result)
    out <- doMoreThings(result, secondResult)
    } yield out

    View Slide

  24. What happens in map / flatMap ?
    Future
    .successful(42)
    .map(f)
    .map(g)

    View Slide

  25. What happens in map / flatMap ?
    Future
    .successful(42)
    .map(f)(ec1)
    .map(g)(ec2)

    View Slide

  26. What happens in map / flatMap ?
    Future
    .successful(42)
    .map(f)(ec1)
    .map(g)(ec2)
    ec1.execute {
    () =>
    val x = f(42)
    ec2.execute {
    val y = g(x)
    ...
    }
    }

    View Slide

  27. Where is context lost?
    ec1.execute {
    () =>
    val x = f(42)
    ...
    }

    View Slide

  28. Where is context lost?
    ec1.execute {
    () =>
    val x = f(42)
    ...
    }

    View Slide

  29. Propagating context between submission to EC and execution
    def runWithContext[A](ctx: Map[String, String])(f: => A): A = {
    val oldContext = getContext()
    setContext(ctx)
    try f
    finally setContext(oldContext)
    }

    View Slide

  30. Propagating context between submission to EC and execution
    // Simplified definitions
    type Runnable = () => Unit
    type ExecutionContext = Runnable => Unit

    View Slide

  31. Propagating context between submission to EC and execution
    // Simplified definitions
    type Runnable = () => Unit
    type ExecutionContext = Runnable => Unit
    object mdcWithContext {
    def getContext(): Map[String, String] =
    MDC.getCopyOfContextMap().asScala.toMap
    def setContext(ctx: Map[String, String]): Unit =
    MDC.setContextMap(ctx.asJava)
    }

    View Slide

  32. Propagating context between submission to EC and execution
    // Simplified definitions
    type Runnable = () => Unit
    type ExecutionContext = Runnable => Unit
    object mdcWithContext {
    def getContext(): Map[String, String] =
    MDC.getCopyOfContextMap().asScala.toMap
    def setContext(ctx: Map[String, String]): Unit =
    MDC.setContextMap(ctx.asJava)
    }
    def mdcAware(ec: ExecutionContext): ExecutionContext =
    runnable => {
    val rootContext = mdcWithContext.getContext()
    ec(() => mdcWithContext.runWithContext(rootContext)(runnable()))
    }

    View Slide

  33. Mapping on an instrumented ExecutionContext
    val rootContext = getContext()
    ec1.execute { () =>
    val storedContext1 = getContext()
    setContext(rootContext)
    try {
    val x = f(42)
    //...
    } finally {
    setContext(storedContext1)
    }
    }
    ec1.execute {
    () =>
    val x = f(42)
    ...
    }

    View Slide

  34. Mapping on an instrumented ExecutionContext
    ec1.execute {
    () =>
    val x = f(42)
    ...
    }
    val rootContext = getContext()
    ec1.execute { () =>
    val storedContext1 = getContext()
    setContext(rootContext)
    try {
    val x = f(42)
    //...
    } finally {
    setContext(storedContext1)
    }
    }

    View Slide

  35. Mapping on an instrumented ExecutionContext
    val rootContext = getContext()
    ec1.execute { () =>
    val storedContext1 = getContext()
    setContext(rootContext)
    try {
    val x = f(42)
    //...
    } finally {
    setContext(storedContext1)
    }
    }

    View Slide

  36. implicit val ec = mdcAware(ExecutionContext.global)
    Keeping MDC on task submission to ExecutionContext

    View Slide

  37. implicit val ec = mdcAware(ExecutionContext.global)
    2020-03-07 20:47:23.779 [INFO] [85c52920-4271-40f0-a150-d34bf4f8b5ec] Started processing payment 2
    2020-03-07 20:47:23.779 [INFO] [78e9ccbc-98cb-440a-848c-d72f917857dd] Started processing payment 1
    2020-03-07 20:47:23.779 [INFO] [d57bf741-51fc-45e6-a6ca-58f6488c9d05] Started processing payment 3
    2020-03-07 20:47:23.785 [INFO] [d57bf741-51fc-45e6-a6ca-58f6488c9d05] Finished processing payment with result 96
    2020-03-07 20:47:23.787 [INFO] [78e9ccbc-98cb-440a-848c-d72f917857dd] Finished processing payment with result 77
    2020-03-07 20:47:23.785 [INFO] [85c52920-4271-40f0-a150-d34bf4f8b5ec] Finished processing payment with result 55
    Keeping MDC on task submission to ExecutionContext

    View Slide

  38. Sure!
    implicit val contextShift: ContextShift[IO] = IO.contextShift(ec)
    Does this work with IO?

    View Slide

  39. Sure!
    implicit val contextShift: ContextShift[IO] = IO.contextShift(ec)
    Does this work with IO?

    View Slide

  40. val processPayment: String => IO[Result] = { ... }
    def executeRequest(paymentId: String) = {
    val requestId = ju.UUID.randomUUID().toString()
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    processPayment(paymentId).map { result =>
    logger.info(s"Finished processing payment with result $result")
    }
    }
    }
    Let's try it

    View Slide

  41. val processPayment: String => IO[Result] = { ... }
    def executeRequest(paymentId: String) = {
    val requestId = ju.UUID.randomUUID().toString()
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    processPayment(paymentId).map { result =>
    logger.info(s"Finished processing payment with result $result")
    }
    }
    }
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 2
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 4
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 3
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 1
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 52
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 19
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 93
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 97
    Let's try it

    View Slide

  42. val processPayment: String => IO[Result] = { ... }
    def executeRequest(paymentId: String) = {
    val requestId = ju.UUID.randomUUID().toString()
    Using.resource(MDC.putCloseable("RequestId", requestId)) { _ =>
    logger.info(s"Started processing payment $paymentId")
    processPayment(paymentId).map { result =>
    logger.info(s"Finished processing payment with result $result")
    }
    }
    }
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 2
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 4
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 3
    2020-03-08 16:20:47.088 [INFO] [-] Started processing payment 1
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 52
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 19
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 93
    2020-03-08 16:20:47.216 [INFO] [-] Finished processing payment with result 97
    Let's try it
    Lazy evaluation!

    View Slide

  43. Little lesson in hackery
    def getContext(): Map[String, String]
    def setContext(ctx: Map[String, String]): Unit
    def runWithContext[A](
    ctx: Map[String, String]
    )(f: => A): A = {
    val rootContext = getContext()
    setContext(ctx)
    try f
    finally setContext(rootContext)
    }

    View Slide

  44. val getContext: IO[Map[String, String]]
    def setContext(ctx: Map[String, String]): IO[Unit]
    def withContext[A](
    ctx: Map[String, String]
    )(f: IO[A]): IO[A] =
    getContext.flatMap { rootContext =>
    setContext(ctx)
    .bracket(_ => f)(
    _ => setContext(rootContext)
    )
    }
    Impure
    Little lesson in hackery
    def getContext(): Map[String, String]
    def setContext(ctx: Map[String, String]): Unit
    def runWithContext[A](
    ctx: Map[String, String]
    )(f: => A): A = {
    val rootContext = getContext()
    setContext(ctx)
    try f
    finally setContext(rootContext)
    }
    Pure

    View Slide

  45. val getContext: IO[Map[String, String]] = IO(MDC.getCopyOfContextMap().asScala.toMap)
    def setContext(ctx: Map[String, String]): IO[Unit] = ???

    View Slide

  46. val getContext: IO[Map[String, String]] = IO(MDC.getCopyOfContextMap().asScala.toMap)
    def setContext(ctx: Map[String, String]): IO[Unit] = IO(MDC.setContextMap(ctx.asJava))
    That'll work... for MDC
    Hack #1 - just set the context

    View Slide

  47. Why does this work?
    withContext(newContext)(
    fa = launchMissiles
    )
    def setContext(ctx: Map[String, String]): IO[Unit] =
    IO(MDC.setContextMap(ctx.asJava))

    View Slide

  48. Why does this work?
    withContext(newContext)(
    fa = IO.shift(mdcAware(ec)) *>
    launchMissiles
    )
    def setContext(ctx: Map[String, String]): IO[Unit] =
    IO(MDC.setContextMap(ctx.asJava))

    View Slide

  49. def setContext(ctx: Map[String, String]): IO[Unit] =
    IO.async[Unit] { cb =>
    runWithContext(ctx) {
    cb(Right(()))
    }
    }
    Hack #2 - use runWithContext and async

    View Slide

  50. def setContext(ctx: Map[String, String]): IO[Unit] =
    IO.async[Unit] { cb =>
    runWithContext(ctx) {
    cb(Right(()))
    }
    }
    This will work* for APIs that don't directly provide setContext
    Hack #2 - use runWithContext and async
    * - with one more change we'll see in a minute

    View Slide

  51. Why does THAT work? - IO.async 101
    object IO {
    def async[A](k: (Either[Throwable, A] => Unit) => Unit): IO[A]
    }
    How do Fibers Work? A Peek Under the Hood
    https://youtu.be/x5_MmZVLiSM

    View Slide

  52. Why does THAT work? - IO.async 101
    object IO {
    def async[A](k: (Either[Throwable, A] => Unit) => Unit): IO[A]
    } cb
    How do Fibers Work? A Peek Under the Hood
    https://youtu.be/x5_MmZVLiSM

    View Slide

  53. Why does THAT work? - IO.async 101
    object IO {
    def async[A](k: (Either[Throwable, A] => Unit) => Unit): IO[A]
    } cb
    val sleep5s = IO.async[Int] { cb: (Either[Throwable, Int] => Unit) =>
    scheduler.schedule(cb(Right(42)), 5.seconds)
    }
    val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i)))
    How do Fibers Work? A Peek Under the Hood
    https://youtu.be/x5_MmZVLiSM

    View Slide

  54. Why does THAT work? - IO.async 101
    val sleep5s = IO.async[Int] { cb: (Either[Throwable, Int] => Unit) =>
    scheduler.schedule(cb(Right(42)), 5.seconds)
    }
    val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i)))
    Mental model
    How do Fibers Work? A Peek Under the Hood
    https://youtu.be/x5_MmZVLiSM

    View Slide

  55. Why does THAT work? - IO.async 101
    val sleep5s = IO.async[Int] { cb: (Either[Throwable, Int] => Unit) =>
    scheduler.schedule(cb(Right(42)), 5.seconds)
    }
    val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i)))
    val cb = {
    case Right(i: Int) => sendMissiles(count = i)
    case Left(e) => throw e
    }
    Mental model
    How do Fibers Work? A Peek Under the Hood
    https://youtu.be/x5_MmZVLiSM

    View Slide

  56. Why does THAT work? - IO.async 101
    val sleep5s = IO.async[Int] { cb: (Either[Throwable, Int] => Unit) =>
    scheduler.schedule(cb(Right(42)), 5.seconds)
    }
    val prog = sleep5s.flatMap(i => IO(sendMissiles(count = i)))
    val cb = {
    case Right(i: Int) => sendMissiles(count = i)
    case Left(e) => throw e
    }
    Mental model
    val prog = scheduler.schedule(sendMissiles(42), ...)
    =>
    How do Fibers Work? A Peek Under the Hood
    https://youtu.be/x5_MmZVLiSM

    View Slide

  57. Why does this work?
    withContext(newContext)(
    fa = launchMissiles
    ) *> remainder
    def setContext(ctx: Map[String, String]): IO[Unit] =
    IO.async[Unit] { cb =>
    runWithContext(ctx) {
    cb(Right(()))
    }
    }

    View Slide

  58. Why does this work?
    withContext(newContext)(
    fa = IO.shift(mdcAware(ec)) *>
    launchMissiles
    ) *> remainder
    def setContext(ctx: Map[String, String]): IO[Unit] =
    IO.async[Unit] { cb =>
    runWithContext(ctx) {
    cb(Right(()))
    }
    }

    View Slide

  59. But wait, there's more!
    def setContext(ctx: Map[String, String]): IO[Unit] =
    IO.shift *>
    IO.async[Unit] { cb =>
    runWithContext(ctx) {
    cb(Right(()))
    }
    }

    View Slide

  60. (Cats) IO trampolines callbacks for stack safety
    def showAsync(tag: String): IO[Unit] =
    IO.async[Unit] { cb =>
    println(s"before $tag")
    cb(Right(()))
    println(s"after $tag")
    }
    showAsync("cb1") *>
    showAsync("cb2") *>
    IO(println("foo"))

    View Slide

  61. (Cats) IO trampolines callbacks for stack safety
    def showAsync(tag: String): IO[Unit] =
    IO.async[Unit] { cb =>
    println(s"before $tag")
    cb(Right(()))
    println(s"after $tag")
    }
    showAsync("cb1") *>
    showAsync("cb2") *>
    IO(println("foo"))
    before cb1
    before cb2
    after cb2
    foo
    after cb1

    View Slide

  62. Shifting before async makes sure running cb evaluates the continuation immediately
    def showAsync(tag: String): IO[Unit] =
    IO.shift *> IO.async[Unit] { cb =>
    println(s"before $tag")
    cb(Right(()))
    println(s"after $tag")
    }
    showAsync("cb1") *>
    showAsync("cb2") *>
    IO(println("foo"))
    before cb1
    after cb1
    before cb2
    foo
    after cb2

    View Slide

  63. Which one should I use?
    The idea is still pretty fresh, but right now:

    View Slide

  64. Which one should I use?
    The idea is still pretty fresh, but right now:
    val getContext: IO[Map[String, String]] = IO(MDC.getCopyOfContextMap().asScala.toMap)
    def setContext(ctx: Map[String, String]): IO[Unit] = IO(MDC.setContextMap(ctx.asJava))
    def withContext[A](
    ctx: Map[String, String]
    )(f: IO[A]): IO[A] =
    getContext.flatMap { rootContext =>
    setContext(ctx)
    .bracket(_ => f)(
    _ => setContext(rootContext)
    )
    }
    Hack #1 (for simplicity)

    View Slide

  65. View Slide

  66. View Slide


  67. Are we done?
    Let's wait till 3:00 am

    View Slide


  68. Are we done?
    Let's wait till 3:00 am

    View Slide


  69. Are we done?
    Let's wait till 3:00 am

    View Slide


  70. Who called this application?
    Are we done?
    Let's wait till 3:00 am

    View Slide


  71. Who called this application?
    What applications did it call? Which URLs?
    Are we done?
    Let's wait till 3:00 am

    View Slide


  72. Who called this application?
    What applications did it call? Which URLs?
    What were the response times?
    Are we done?
    Let's wait till 3:00 am

    View Slide


  73. Who called this application?
    What applications did it call? Which URLs?
    What were the response times?
    How long did the whole call take?
    Are we done?
    Let's wait till 3:00 am

    View Slide


  74. Who called this application?
    What applications did it call? Which URLs?
    What were the response times?
    How long did the whole call take?
    Were there any (caught) errors?
    Are we done?
    Let's wait till 3:00 am

    View Slide

  75. Logs are linear
    19:43:59.023 [INFO] Started processing payment 1
    19:43:59.026 [INFO] Calling database
    19:43:59.137 [INFO] Got result from database
    19:43:59.154 [INFO] Calling remote service
    19:43:59.155 [INFO] Reading file
    19:43:59.208 [INFO] Called remote service
    19:43:59.360 [INFO] Read file
    def executeRequest(paymentId: String) =
    for {
    _ <- logger.info(s"Started processing payment $paymentId")
    _ <- logger.info("Calling database")
    _ <- callDatabase
    _ <- logger.info("Got result from database")
    result <- (
    logger.info("Calling remote service") *>
    callRemoteService <*
    logger.info("Called remote service"),
    logger.info("Reading file") *>
    readFile <*
    logger.info("Read file")
    ).parMapN(combineResults)
    } yield result
    time
    Have fun subtracting timestamps!

    View Slide

  76. What we need
    def executeRequest(paymentId: String) =
    span("process") {
    for {
    _ <- logger.info(s"Processing payment $paymentId")
    _ <- span("db")(callDatabase)
    result <- (
    span("remote")(callRemoteService),
    span("file")(readFile)
    ).parMapN(combineResults)
    } yield result
    }

    View Slide

  77. What we need
    def executeRequest(paymentId: String) =
    span("process") {
    for {
    _ <- logger.info(s"Processing payment $paymentId")
    _ <- span("db")(callDatabase)
    result <- (
    span("remote")(callRemoteService),
    span("file")(readFile)
    ).parMapN(combineResults)
    } yield result
    }
    - immediately see order of execution

    - concurrency becomes obvious

    - inspect details on demand (expand span)

    View Slide

  78. "...a form of structured logging that assumes duration and hierarchy"
    Source: https://tersesystems.com/blog/2020/03/10/a-taxonomy-of-logging/
    Distributed tracing

    View Slide

  79. "...a form of structured logging that assumes duration and hierarchy"
    Source: https://tersesystems.com/blog/2020/03/10/a-taxonomy-of-logging/
    Distributed tracing
    Trace

    View Slide

  80. "...a form of structured logging that assumes duration and hierarchy"
    Source: https://tersesystems.com/blog/2020/03/10/a-taxonomy-of-logging/
    Distributed tracing
    Trace
    Span

    View Slide

  81. Distributed tracing solutions

    View Slide

  82. Case study
    - log before

    - async call to database (simulated by sleep)

    - async call to remote service

    - log afterwards
    new BusinessLogic[F] {
    def execute(args: Args): F[Result] =
    for {
    _ <- Logger[F].info(show"Executing request $args")
    _ <- Timer[F].sleep(100.millis)
    _ <- client.successful(POST(uri"http://localhost:8080/execute"))
    _ <- Logger[F].info(show"Executed request $args")
    } yield Result(show"${args.message} finished")
    }
    Reference implementation (no tracing)
    Check out the source code for details (repo linked in final slide)

    View Slide

  83. Create our own model of a span for simplicity

    Keep track via parameter passing

    Provide to logger/client which will set info as MDC

    Convert to zipkin model and report manually after each span
    Reporting traces - direct, no-magic approach

    View Slide

  84. Simple span model
    final case class Span(
    name: String,
    traceId: String,
    spanId: String,
    parentSpanId: Option[String],
    values: Map[String, String]
    )

    View Slide

  85. Reporting traces - direct, no-magic approach
    def databaseCall(rootSpan: Span) =
    newSpan(
    "db-call",
    rootSpan,
    Map("db.query" -> "select * from users where id = ?")
    ) { span =>
    logger.info(span.toMap)("Running db call") *>
    IO.sleep(100.millis)
    }

    View Slide

  86. Reporting traces - direct, no-magic approach
    def databaseCall(rootSpan: Span) =
    newSpan(
    "db-call",
    rootSpan,
    Map("db.query" -> "select * from users where id = ?")
    ) { span =>
    logger.info(span.toMap)("Running db call") *>
    IO.sleep(100.millis)
    }
    def clientCall(rootSpan: Span) =
    newSpan("remote-call", rootSpan) { child =>
    client.successful(
    POST(
    uri"http://localhost:8080/execute",
    child.toTraceHeaders.toList: _*
    )
    )
    }

    View Slide

  87. Reporting traces - direct, no-magic approach
    def databaseCall(rootSpan: Span) =
    newSpan(
    "db-call",
    rootSpan,
    Map("db.query" -> "select * from users where id = ?")
    ) { span =>
    logger.info(span.toMap)("Running db call") *>
    IO.sleep(100.millis)
    }
    def clientCall(rootSpan: Span) =
    newSpan("remote-call", rootSpan) { child =>
    client.successful(
    POST(
    uri"http://localhost:8080/execute",
    child.toTraceHeaders.toList: _*
    )
    )
    }
    def execute(args: Args, span: Span): IO[Result] =
    for {
    _ <- logger.info(span.toMap)(show"Executing request $args")
    _ <- databaseCall(span)
    _ <- clientCall(span)
    _ <- logger.info(span.toMap)(show"Executed request $args")
    } yield Result(show"${args.message} finished")

    View Slide

  88. What we'll get in UI
    (applies to all our approaches here)

    View Slide

  89. What we'll get in UI
    (applies to all our approaches here)

    View Slide

  90. What we'll get in UI
    (applies to all our approaches here)

    View Slide

  91. Distributed tracing - Kamon, same threadlocal magic as MDC
    val databaseCall =
    childSpan(
    "db-call",
    Map("db.query" -> "select * from users where id = ?")
    ) {
    logger.info("Running db call") *>
    IO.sleep(100.millis)
    }
    val clientCall =
    childSpan("remote-call") {
    //grab context before, set it again afterwards
    Tracing[IO].keepSpanAround {
    client.successful(POST(uri"http://localhost:8080/execute"))
    }
    }
    def execute(args: Args): IO[Result] =
    for {
    _ <- logger.info(show"Executing request $args")
    _ <- databaseCall
    _ <- clientCall
    _ <- logger.info(show"Executed request $args")
    } yield Result(show"${args.message} finished")

    View Slide

  92. Comparison
    Direct tracing (parameter passing)
    - Lots of boilerplate

    - Viral (influences APIs on all levels)

    - Extra work to integrate with MDC

    - Reporting done manually

    + Zero magic

    + No loss of context
    Kamon tracing (instrumentation)
    - Lots of magic (AOP, thread locals)

    - Complex and fragile instrumentation for async IO

    - Each interface for async needs special instrumentation

    + No changes in APIs

    + No boilerplate

    + Lots of integrations

    + Handles reporting for you

    + Context passed to MDC

    View Slide

  93. trait BusinessLogic[F[_]] {
    def execute(args: Args, span: Span): F[Result]
    }
    trait AnotherProgram[F[_]] {
    def summary(result: Result, span: Span): F[Unit]
    }
    trait BusinessLogic[F[_]] {
    def execute(args: Args): F[Result]
    }
    trait AnotherProgram[F[_]] {
    def summary(result: Result): F[Unit]
    }
    =>
    Who's the winner?
    Both solutions have shortcomings. Ideally...

    No boilerplate

    No magic

    No virality

    Lots of integrations

    MDC support

    Automatic reporting

    View Slide

  94. @finalAlg
    trait BusinessLogic[F[_]] {
    def execute(args: Args): F[Result]
    }
    @finalAlg
    trait AnotherProgram[F[_]] {
    def summary(result: Result): F[Unit]
    }
    def program[F[_]: FlatMap: BusinessLogic: AnotherProgram: Trace](args: Args): F[Unit] =
    for {
    result <- Trace[F].span("execute")(BusinessLogic[F].execute(args))
    _ <- Trace[F].span("summary")(AnotherProgram[F].summary(result))
    } yield ()
    type M[A] = ???
    def execute(args: Args): M[A] = program[M](args)
    What would it look like?

    View Slide

  95. What's the F?
    def program[F[_]: FlatMap: BusinessLogic: AnotherProgram: Trace](args: Args): F[Unit] =
    for {
    result <- Trace[F].span("execute")(BusinessLogic[F].execute(args))
    _ <- Trace[F].span("summary")(AnotherProgram[F].summary(result))
    } yield ()
    def execute(args: Args) = program[?????](args)

    View Slide

  96. final case class MyMagicMonadForContext[F[_], Context, A](impl: Context => F[A])
    Let's make a new monad

    View Slide

  97. final case class MyMagicMonadForContext[F[_], Context, A](impl: Context => F[A]) {
    def flatMap[B](
    f: A => MyMagicMonadForContext[F, Context, B]
    )(
    implicit F: Monad[F]
    ): MyMagicMonadForContext[F, Context, B] = MyMagicMonadForContext { ctx =>
    impl(ctx).flatMap(f(_).impl(ctx))
    }
    def local(f: Context => Context): MyMagicMonadForContext[F, Context, A] =
    MyMagicMonadForContext(f andThen impl)
    def scope(ctx: Context): MyMagicMonadForContext[F, Context, A] = local(_ => ctx)
    }
    Let's make a new monad

    View Slide

  98. We made a new monad!

    View Slide

  99. We made a new monad!

    View Slide

  100. type AuthenticatedT [F[_], A] = MyMagicMonadForContext[F, User, A]
    type Transactional [F[_], A] = MyMagicMonadForContext[F, Connection, A]
    type Traced [F[_], A] = MyMagicMonadForContext[F, Span, A]
    type WithTenant [F[_], A] = MyMagicMonadForContext[F, ClientId, A]
    We made a new monad!

    View Slide

  101. type AuthenticatedT [F[_], A] = MyMagicMonadForContext[F, User, A]
    type Transactional [F[_], A] = MyMagicMonadForContext[F, Connection, A]
    type Traced [F[_], A] = MyMagicMonadForContext[F, Span, A]
    type WithTenant [F[_], A] = MyMagicMonadForContext[F, ClientId, A]
    type ZIO[-R, +E, +A] = MyMagicMonadForContext[EitherT[IO, E, *], R, A]
    semantically ;)
    We made a new monad!

    View Slide

  102. type AuthenticatedT [F[_], A] = MyMagicMonadForContext[F, User, A]
    type Transactional [F[_], A] = MyMagicMonadForContext[F, Connection, A]
    type Traced [F[_], A] = MyMagicMonadForContext[F, Span, A]
    type WithTenant [F[_], A] = MyMagicMonadForContext[F, ClientId, A]
    type ZIO[-R, +E, +A] = MyMagicMonadForContext[EitherT[IO, E, *], R, A]
    semantically ;)
    https://youtu.be/7xSfLPD6tiQ?t=707
    cats
    We made a new monad!

    View Slide

  103. Hint:
    Did we make a new monad?

    View Slide

  104. final case class Kleisli[F[_], -A, B](run: A => F[B])
    Hint:
    Did we make a new monad?

    View Slide

  105. final case class Kleisli[F[_], -A, B](run: A => F[B])
    Hint:
    type ReaderT[F[_], -A, B] = Kleisli[F, A, B]
    val ReaderT = Kleisli
    Did we make a new monad?

    View Slide

  106. Why is ReaderT the perfect fit
    - automatic passing of context

    - get current context for free with Kleisli.ask: Kleisli[F, A, A]
    - change context in a region easily with local

    View Slide

  107. However...
    - we still need integrations (wrapping clients in ones that take headers from context in Kleisli)

    - we need an escape hatch for when we really need MDC

    - we need a cool library that provides a nice API on top of Kleisli...

    View Slide

  108. However...
    - we still need integrations (wrapping clients in ones that take headers from context in Kleisli)

    - we need an escape hatch for when we really need MDC

    - we need a cool library that provides a nice API on top of Kleisli...

    View Slide

  109. Natchez API
    trait Span[F[_]] {
    def put(fields: (String, TraceValue)*): F[Unit]
    def kernel: F[Kernel]
    def span(name: String): Resource[F, Span[F]]
    }
    trait Trace[F[_]] {
    def put(fields: (String, TraceValue)*): F[Unit]
    def kernel: F[Kernel]
    def span[A](name: String)(k: F[A]): F[A]
    }
    object Trace {
    implicit def kleisliInstance[F[_]: Bracket[*[_], Throwable]]: Trace[Kleisli[F, Span[F], *]] = ...
    }

    View Slide

  110. Quick teaser of usage
    def execute(args: Args): F[Result] =
    for {
    _ <- Logger[F].info(show"Executing request $args")
    _ <- Trace[F].span("db")(Timer[F].sleep(100.millis))
    _ <- client.successful(POST(uri"http://localhost:8080/execute"
    _ <- Logger[F].info(show"Executed request $args")
    } yield Result(show"${args.message} finished")

    View Slide

  111. Quick teaser of usage
    type Traced[A] = Kleisli[IO, Span[IO], A]
    val businessLogic = BusinessLogic.instance[Traced]
    def exec(msg: String): IO[Result] =
    Zipkin
    .entryPoint[IO]("client", blocker)
    .use {
    _.root(msg).use {
    businessLogic.execute(Args(msg)).run
    }
    }
    def execute(args: Args): F[Result] =
    for {
    _ <- Logger[F].info(show"Executing request $args")
    _ <- Trace[F].span("db")(Timer[F].sleep(100.millis))
    _ <- client.successful(POST(uri"http://localhost:8080/execute"
    _ <- Logger[F].info(show"Executed request $args")
    } yield Result(show"${args.message} finished")

    View Slide

  112. Quick teaser of usage
    type Traced[A] = Kleisli[IO, Span[IO], A]
    val businessLogic = BusinessLogic.instance[Traced]
    def exec(msg: String): IO[Result] =
    Zipkin
    .entryPoint[IO]("client", blocker)
    .use {
    _.root(msg).use {
    businessLogic.execute(Args(msg)).run
    }
    }
    def execute(args: Args): F[Result] =
    for {
    _ <- Logger[F].info(show"Executing request $args")
    _ <- Trace[F].span("db")(Timer[F].sleep(100.millis))
    _ <- client.successful(POST(uri"http://localhost:8080/execute"
    _ <- Logger[F].info(show"Executed request $args")
    } yield Result(show"${args.message} finished")
    def kleisliTracedClient[F[_]: Bracket[*[_], Throwable]: Defer](
    underlying: Client[F]
    ): Client[Kleisli[F, Span[F], *]] = ...

    View Slide

  113. Quick teaser of usage
    type Traced[A] = Kleisli[IO, Span[IO], A]
    val businessLogic = BusinessLogic.instance[Traced]
    def exec(msg: String): IO[Result] =
    businessLogic.execute(Args(msg)).run
    def execute(args: Args): F[Result] =
    for {
    _ <- Logger[F].info(show"Executing request $args")
    _ <- Trace[F].span("db")(Timer[F].sleep(100.millis))
    _ <- client.successful(POST(uri"http://localhost:8080/execute"
    _ <- Logger[F].info(show"Executed request $args")
    } yield Result(show"${args.message} finished")
    def kleisliTracedClient[F[_]: Bracket[*[_], Throwable]: Defer](
    underlying: Client[F]
    ): Client[Kleisli[F, Span[F], *]] = ...
    Zipkin
    .entryPoint[IO]("client", blocker)
    .use {
    _.root(msg).use {
    }
    }
    This should eventually be delegated to e.g. http4s server wrapper

    View Slide

  114. Integrations?
    As of now Soon In the meantime...
    Using Kamon for reporting?

    View Slide

  115. Does it work?
    Open the examples of this talk and see for yourself (including temporary custom zipkin integration)

    View Slide

  116. Thank you
    blog.kubukoz.com
    @kubukoz
    Slides: speakerdeck.com/kubukoz
    Code: git.io/Jv6mN
    Find me on YouTube! (https://youtube.com/channel/UCBSRCuGz9laxVv0rAnn2O9Q)

    View Slide