Connecting the dots - building and structuring a functional application in Scala

Functional programming relies on building programs from orthogonal, composable blocks. That's likely one of the reasons why full-blown application frameworks haven't gained much traction in the functional ecosystem.

However, we still need to structure our code and wire up our applications in a way that lets us keep them modular, testable and simply pleasant to work with - in this talk, we will learn how to do just that!

Using an application that integrates with several third-party services to process data in a streaming fashion, and expose its results to downstream clients, we will walk through the architecture design and testing setup for a functional app on the Typelevel stack.

Jakub Kozłowski

May 05, 2021

Transcript

  1. CONNECTING THE DOTS
    BUILDING AND STRUCTURING A FUNCTIONAL APPLICATION IN SCALA
    JAKUB KOZŁOWSKI, DISNEY STREAMING
    YOW! LAMBDA JAM 2021
    Photo by Kumiko SHIMIZU on Unsplash

  2. PROBLEM STATEMENT

  3. PROBLEM STATEMENT
    We want to build an application

  4. PROBLEM STATEMENT
    We want to build an application
    There are some sources of data (databases, APIs, event streams)

  5. PROBLEM STATEMENT
    We want to build an application
    There are some sources of data (databases, APIs, event streams)
    We need to serve HTTP traffic

  6. PROBLEM STATEMENT
    We want to build an application
    There are some sources of data (databases, APIs, event streams)
    We need to serve HTTP traffic
    Some things need to run in the background additionally

  7. PROBLEM STATEMENT
    We want to build an application
    There are some sources of data (databases, APIs, event streams)
    We need to serve HTTP traffic
    Some things need to run in the background additionally
    We want to do it with FP

  8. DEPENDENCY GRAPH
    TYPICAL APPLICATION

  9. DEPENDENCY GRAPH
    def database: Database
    def businessLogic(db: Database): BusinessLogic
    def server(logic: BusinessLogic): Server
    def backgroundProcesses(logic: BusinessLogic): Processes

  10. DEPENDENCY GRAPH
    def database: Database
    def businessLogic(db: Database): BusinessLogic
    def server(logic: BusinessLogic): Server
    def backgroundProcesses(logic: BusinessLogic): Processes
    def build: (Server, Processes) = {
      val logic = businessLogic(database)
      (server(logic), backgroundProcesses(logic))
    }

  11. DEPENDENCY GRAPH
    This could be us, but the real world exists...
    def database: Database
    def businessLogic(db: Database): BusinessLogic
    def server(logic: BusinessLogic): Server
    def backgroundProcesses(logic: BusinessLogic): Processes
    def build: (Server, Processes) = {
      val logic = businessLogic(database)
      (server(logic), backgroundProcesses(logic))
    }

  12. RESOURCES.

  13. RESOURCES.
    But we can just kill our application when it quits, right?

  14. CONNECTION POOLS?

  15. CONNECTION POOLS?
    But we can use try-finally, right?

  16. TRY-FINALLY
    def getConnection(db: Database): Connection
    def returnConnection(conn: Connection): Unit
    def doWork(conn: Connection): Result

  17. TRY-FINALLY
    def getConnection(db: Database): Connection
    def returnConnection(conn: Connection): Unit
    def doWork(conn: Connection): Result
    val result = {
      val c = getConnection(db)
      try doWork(c)
      finally returnConnection(c)
    }
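
The try-finally pattern on this slide can be sketched as a small runnable program. This is only an illustration: `Connection`, `Database`, the `log` buffer and the `withConnection` helper are hypothetical stand-ins, not code from the talk.

```scala
import scala.collection.mutable.ListBuffer

object TryFinallyDemo {
  // Hypothetical stand-ins for the types named on the slide.
  final case class Connection(id: Int)
  final class Database

  // Records what happened, so the release can be observed.
  val log: ListBuffer[String] = ListBuffer.empty

  def getConnection(db: Database): Connection = {
    log += "acquire"
    Connection(1)
  }

  def returnConnection(conn: Connection): Unit =
    log += "release"

  // Runs `work` with a connection and always returns the connection,
  // whether `work` completes normally or throws.
  def withConnection[A](db: Database)(work: Connection => A): A = {
    val c = getConnection(db)
    try work(c)
    finally returnConnection(c)
  }
}
```

Calling `withConnection` with a function that throws still appends `"release"` to the log, which is the guarantee the slide relies on.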

  18. TRY-FINALLY IN FP
    def getConnection(db: Database): IO[Connection]
    def returnConnection(conn: Connection): IO[Unit]
    def doWork(conn: Connection): IO[Result]
    val result: IO[Result] =
      getConnection(db).bracket { c =>
        doWork(c)
      }(returnConnection)
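
A minimal sketch of the same idea with Cats Effect 3 `IO`, assuming the `cats-effect` library is on the classpath. The `Connection` type and the `Ref`-based log are hypothetical; the point is only that `bracket` runs the release action even when the work fails.

```scala
import cats.effect.{IO, Ref}
import cats.effect.unsafe.implicits.global

object BracketDemo {
  // Hypothetical stand-in for a pooled connection.
  final case class Connection(id: Int)

  // Runs a failing `doWork` inside `bracket` and returns the outcome
  // together with the events recorded along the way.
  def run: IO[(Either[Throwable, Int], List[String])] =
    for {
      log <- Ref[IO].of(List.empty[String])
      getConnection = log.update(_ :+ "acquire").as(Connection(1))
      returnConnection = (_: Connection) => log.update(_ :+ "release")
      doWork = (_: Connection) => IO.raiseError[Int](new RuntimeException("boom"))
      result <- getConnection.bracket(doWork)(returnConnection).attempt
      events <- log.get
    } yield (result, events)
}
```

Running `BracketDemo.run.unsafeRunSync()` should give a `Left` for the result and a log containing both the acquire and the release events.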

  19. MORE RESOURCES?
    Can't keep nesting bracket forever

  20. MORE RESOURCES?
    Can't keep nesting bracket forever
    ABSTRACTION?
    How to hide details?

  21. RESOURCE DATA STRUCTURE

  22. RUNNING A RESOURCE?

  23. RUNNING A RESOURCE?

  24. COMPOSITION?
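
The slides on the resource data structure, running it and composing it were images and are not in this transcript. As a rough illustration of those three ideas, here is a deliberately simplified, synchronous `Resource` in plain Scala. It is not the Cats Effect implementation (which is parameterised by an effect type and handles cancellation); every name below is an assumption made for the sketch.

```scala
object MiniResource {
  // Data structure: a recipe that, when run, yields a value
  // together with the action that releases it.
  final case class Resource[A](allocate: () => (A, () => Unit)) {

    // Running: acquire, use, and release even if `f` throws.
    def use[B](f: A => B): B = {
      val (a, release) = allocate()
      try f(a)
      finally release()
    }

    // Composition: the inner resource is acquired after the outer one
    // and released before it.
    def flatMap[B](f: A => Resource[B]): Resource[B] =
      Resource { () =>
        val (a, releaseA) = allocate()
        try {
          val (b, releaseB) = f(a).allocate()
          (b, () => try releaseB() finally releaseA())
        } catch {
          // Acquiring the inner resource failed: give back the outer one.
          case e: Throwable => releaseA(); throw e
        }
      }

    def map[B](f: A => B): Resource[B] = flatMap(a => Resource.pure(f(a)))
  }

  object Resource {
    def make[A](acquire: => A)(release: A => Unit): Resource[A] =
      Resource { () => val a = acquire; (a, () => release(a)) }

    def pure[A](a: A): Resource[A] = Resource(() => (a, () => ()))
  }
}
```

Because `flatMap` and `map` are defined, such a resource also works in a `for` comprehension, and releases happen in reverse order of acquisition.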

  25. WHY AM I TALKING ABOUT THIS?
    def database: Database
    def businessLogic(db: Database): BusinessLogic
    def server(logic: BusinessLogic): Server
    def backgroundProcesses(logic: BusinessLogic): Processes

  26. WHY AM I TALKING ABOUT THIS?
    def database: Resource[Database]
    def businessLogic(db: Database): BusinessLogic
    def server(logic: BusinessLogic): Resource[Server]
    def backgroundProcesses(logic: BusinessLogic): Resource[Processes]

  27. WHY AM I TALKING ABOUT THIS?
    def database: Resource[Database]
    def businessLogic(db: Database): BusinessLogic
    def server(logic: BusinessLogic): Resource[Server]
    def backgroundProcesses(logic: BusinessLogic): Resource[Processes]
    def build: Resource[(Server, Processes)] = database.flatMap { db =>
      val logic = businessLogic(db)
      server(logic).flatMap { srv =>
        backgroundProcesses(logic).map(p => (srv, p))
      }
    }

  28. WHY AM I TALKING ABOUT THIS?
    def database: Resource[Database]
    def businessLogic(db: Database): BusinessLogic
    def server(logic: BusinessLogic): Resource[Server]
    def backgroundProcesses(logic: BusinessLogic): Resource[Processes]
    def build: Resource[(Server, Processes)] =
      for {
        db <- database
        logic = businessLogic(db)
        srv <- server(logic)
        processes <- backgroundProcesses(logic)
      } yield (srv, processes)
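
A runnable sketch of this `build` with Cats Effect 3, assuming `cats-effect` is available. The slides abbreviate the type as `Resource[A]`; the library type takes the effect as well, so the sketch uses `Resource[IO, A]`. The case classes and the `logged` helper are hypothetical and exist only to make acquisition and release order visible.

```scala
import cats.effect.{IO, Ref, Resource}
import cats.effect.unsafe.implicits.global

object DependencyGraphDemo {
  // Hypothetical stand-ins for the components named on the slides.
  final case class Database(name: String)
  final case class BusinessLogic(db: Database)
  final case class Server(logic: BusinessLogic)
  final case class Processes(logic: BusinessLogic)

  // Wraps a value in a Resource that records when it is acquired and released.
  def logged[A](log: Ref[IO, List[String]], name: String)(value: A): Resource[IO, A] =
    Resource.make(log.update(_ :+ s"start $name").as(value))(_ => log.update(_ :+ s"stop $name"))

  def build(log: Ref[IO, List[String]]): Resource[IO, (Server, Processes)] =
    for {
      db <- logged(log, "database")(Database("main"))
      logic = BusinessLogic(db)
      srv <- logged(log, "server")(Server(logic))
      processes <- logged(log, "processes")(Processes(logic))
    } yield (srv, processes)

  // Acquires the whole graph, does nothing with it, and returns the event log.
  def events: IO[List[String]] =
    for {
      log <- Ref[IO].of(List.empty[String])
      _ <- build(log).use(_ => IO.unit)
      result <- log.get
    } yield result
}
```

The log shows components starting in dependency order and stopping in the reverse order, without any explicit shutdown code in `build`.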

  29. HOW IS A BACKGROUND PROCESS A RESOURCE?
    Watch this space: yt.kubukoz.com -> "Background processing in functional Scala" playlist
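
One possible reading of "a background process is a resource", sketched with Cats Effect 3 (an assumption on my part, not necessarily what the playlist shows): acquiring the resource starts a fiber, and releasing it cancels that fiber. The sketch uses `IO#background`; the `backgroundProcess` and `demo` names are hypothetical.

```scala
import cats.effect.{Deferred, IO, Ref, Resource}
import cats.effect.unsafe.implicits.global

object BackgroundDemo {
  // Starts `task` on its own fiber; releasing the resource cancels the fiber.
  def backgroundProcess(task: IO[Unit]): Resource[IO, Unit] =
    task.background.map(_ => ())

  // Returns true if the worker was cancelled once the resource went out of scope.
  def demo: IO[Boolean] =
    for {
      started <- Deferred[IO, Unit]
      cancelled <- Ref[IO].of(false)
      // A worker that signals it is running and then never finishes on its own.
      worker = (started.complete(()) *> IO.never[Unit]).onCancel(cancelled.set(true))
      _ <- backgroundProcess(worker).use(_ => started.get)
      wasCancelled <- cancelled.get
    } yield wasCancelled
}
```

With this shape, a background process composes in the same `for` comprehension as a database or a server, and its lifetime is tied to the rest of the application.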

  30. DEPENDENCY GRAPH AS A RESOURCE

  31. ALGEBRAS / CAPABILITY TRAITS

  32. ALGEBRAS / CAPABILITY TRAITS
    Tagless Final style

  33. ALGEBRAS / CAPABILITY TRAITS
    Tagless Final style
    Interfaces parameterised by an effect

  34. ALGEBRAS / CAPABILITY TRAITS
    Tagless Final style
    Interfaces parameterised by an effect
    Capability traits - lawless type classes

  35. RULES OF THUMB

  36. RULES OF THUMB
    Prefer capability traits (Files[F], Network[F], Console[F]) over Sync/Async

  37. RULES OF THUMB
    Prefer capability traits (Files[F], Network[F], Console[F]) over Sync/Async
    Implicit or explicit?

  38. RULES OF THUMB
    Prefer capability traits (Files[F], Network[F], Console[F]) over Sync/Async
    Implicit or explicit?
    - 1 instance per type: implicit definition, pass implicitly

  39. RULES OF THUMB
    Prefer capability traits (Files[F], Network[F], Console[F]) over Sync/Async
    Implicit or explicit?
    - 1 instance per type: implicit definition, pass implicitly
    - has possible test instance: explicit definition, pass implicitly

  40. RULES OF THUMB
    Prefer capability traits (Files[F], Network[F], Console[F]) over Sync/Async
    Implicit or explicit?
    - 1 instance per type: implicit definition, pass implicitly
    - has possible test instance: explicit definition, pass implicitly
    - multiple instances in app: all explicit
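
A small sketch of these rules of thumb, assuming `cats` and `cats-effect` are available. `Timestamps` is a hypothetical capability trait invented for this example. It follows the "has possible test instance" rule: the instances are defined explicitly (not marked `implicit`), and the caller passes the chosen one implicitly.

```scala
import cats.{Applicative, Functor}
import cats.effect.{IO, Sync}
import cats.effect.unsafe.implicits.global
import cats.syntax.all._

object CapabilityDemo {
  // Capability trait: a narrow interface parameterised by the effect F.
  trait Timestamps[F[_]] {
    def now: F[Long]
  }

  object Timestamps {
    def apply[F[_]](implicit F: Timestamps[F]): Timestamps[F] = F

    // Real instance: this is the only place that needs Sync.
    def system[F[_]: Sync]: Timestamps[F] = new Timestamps[F] {
      def now: F[Long] = Sync[F].delay(System.currentTimeMillis())
    }

    // Test instance: always returns the same value.
    def constant[F[_]: Applicative](value: Long): Timestamps[F] = new Timestamps[F] {
      def now: F[Long] = value.pure[F]
    }
  }

  // Logic asks only for what it needs: Timestamps and Functor, not Sync/Async.
  def stamp[F[_]: Timestamps: Functor](message: String): F[String] =
    Timestamps[F].now.map(t => s"[$t] $message")

  // Usage: the caller chooses the instance and passes it implicitly.
  def example: IO[String] = {
    implicit val timestamps: Timestamps[IO] = Timestamps.constant[IO](42L)
    stamp[IO]("hello")
  }
}
```

Because `stamp` does not require `Sync`, its signature rules out arbitrary side effects, and swapping `constant` for `system` is a one-line change at the call site.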

  41. IMAGE PROCESSING APP
    CASE STUDY

  42. IMAGE PROCESSING APP
    CASE STUDY
    Project Goals

  43. IMAGE PROCESSING APP
    CASE STUDY
    Project Goals
    Search images from a datasource by the text on them (OCR)

  44. IMAGE PROCESSING APP
    CASE STUDY
    Project Goals
    Search images from a datasource by the text on them (OCR)
    Live OCR is too slow, so we'll index ahead of time

  45. IMAGE PROCESSING APP
    CASE STUDY
    Project Goals
    Search images from a datasource by the text on them (OCR)
    Live OCR is too slow, so we'll index ahead of time
    github.com/kubukoz/dropbox-demo

  46. DATA FLOW
    (LINEAR)

  47. TANGENT: HEXAGONAL ARCHITECTURE
    Hexagonal Architecture by Cth027, licensed under CC BY-SA 4.0

  48. TANGENT: HEXAGONAL ARCHITECTURE
    Hexagonal Architecture by Cth027, licensed under CC BY-SA 4.0
    Or... just sensible architecture.

  49. TANGENT: HEXAGONAL ARCHITECTURE
    Hexagonal Architecture by Cth027, licensed under CC BY-SA 4.0
    Or... just sensible architecture.
    Keep vendor/implementation-specific details hidden and away from core logic

  50. TANGENT: HEXAGONAL ARCHITECTURE
    Hexagonal Architecture by Cth027, licensed under CC BY-SA 4.0
    Or... just sensible architecture.
    Keep vendor/implementation-specific details hidden and away from core logic
    Only talk to these via adapters with a simple API

  51. TANGENT: HEXAGONAL ARCHITECTURE
    Hexagonal Architecture by Cth027, licensed under CC BY-SA 4.0
    Jakub Nabrdalik - Hexagonal Architecture in practice
    https://www.youtube.com/watch?v=sOaS83Ir8Ck
    Or... just sensible architecture.
    Keep vendor/implementation-specific details hidden and away from core logic
    Only talk to these via adapters with a simple API

  52. DEPENDENCY GRAPH

  53. PROJECT STRUCTURE
    shared - contains common vocabulary
      Used by adapters and core logic
    imagesource, ocr, indexer - modules
    root - contains core logic + http module
      Standard sbt pattern for "main" sources

  54. OCR MODULE

  55. OCR MODULE
    ProcessRunner - capability trait for running system processes

  56. OCR MODULE
    ProcessRunner - capability trait for running system processes
    Tesseract - runs a Tesseract process

  57. OCR MODULE
    ProcessRunner - capability trait for running system processes
    Tesseract - runs a Tesseract process
    OCR - wraps Tesseract and specifies config options (languages)

  58. OCR MODULE
    ProcessRunner - capability trait for running system processes
    Tesseract - runs a Tesseract process
    OCR - wraps Tesseract and specifies config options (languages)
    TestOCRInstances - contains test fakes for OCR for usage in tests of higher-level components (processes)

  59. PROCESS RUNNER
    package com.kubukoz.process
    trait ProcessRunner[F[_]] {
      def run(program: List[String]): Resource[F, ProcessRunner.Running[F]]
    }
    object ProcessRunner {
      def apply[F[_]](implicit F: ProcessRunner[F]): ProcessRunner[F] = F
      implicit def instance[F[_]: Async]: ProcessRunner[F] = ...
    }

  60. TESSERACT
    package com.kubukoz.ocr.tesseract
    private[ocr] trait Tesseract[F[_]] {
      def decode(input: fs2.Stream[F, Byte], languages: List[String]): F[String]
    }
    object Tesseract {
      def apply[F[_]](implicit F: Tesseract[F]): Tesseract[F] = F
      def instance[F[_]: ProcessRunner: Logger: Concurrent](implicit SC: fs2.Compiler[F, F]): Tesseract[F] = ...
    }

  61. OCR

  62. OCR
    package com.kubukoz.ocr
    trait OCR[F[_]] {
      def decodeText(file: fs2.Stream[F, Byte]): F[DecodedText]
    }
    object OCR {
      def apply[F[_]](implicit F: OCR[F]): OCR[F] = F
    }

  63. OCR
    package com.kubukoz.ocr
    trait OCR[F[_]] {
      def decodeText(file: fs2.Stream[F, Byte]): F[DecodedText]
    }
    object OCR {
      def apply[F[_]](implicit F: OCR[F]): OCR[F] = F
      final case class Config(languages: List[String])
      def config[F[_]]: ConfigValue[F, Config] = ...
    }

  64. OCR
    package com.kubukoz.ocr
    trait OCR[F[_]] {
      def decodeText(file: fs2.Stream[F, Byte]): F[DecodedText]
    }
    object OCR {
      def apply[F[_]](implicit F: OCR[F]): OCR[F] = F
      final case class Config(languages: List[String])
      def config[F[_]]: ConfigValue[F, Config] = ...
      private[ocr] def tesseractInstance[F[_]: Tesseract: Functor](config: Config): OCR[F] = new OCR[F] {
        def decodeText(file: fs2.Stream[F, Byte]): F[DecodedText] =
          Tesseract[F].decode(file, config.languages).map(DecodedText(_))
      }
    }

  65. OCR
    package com.kubukoz.ocr
    trait OCR[F[_]] {
      def decodeText(file: fs2.Stream[F, Byte]): F[DecodedText]
    }
    object OCR {
      def apply[F[_]](implicit F: OCR[F]): OCR[F] = F
      final case class Config(languages: List[String])
      def config[F[_]]: ConfigValue[F, Config] = ...
      private[ocr] def tesseractInstance[F[_]: Tesseract: Functor](config: Config): OCR[F] = new OCR[F] {
        def decodeText(file: fs2.Stream[F, Byte]): F[DecodedText] =
          Tesseract[F].decode(file, config.languages).map(DecodedText(_))
      }
      def module[F[_]: Concurrent: ProcessRunner: Logger](config: Config): OCR[F] = {
        implicit val tesseract = Tesseract.instance[F]
        OCR.tesseractInstance[F](config)
      }
    }

  66. TEST INSTANCE
    package com.kubukoz.ocr
    object TestOCRInstances {
      // decodeText("hello".getBytes) == "hello"
      def simple[F[_]: Functor](implicit SC: fs2.Compiler[F, F]): OCR[F] =
        _.through(fs2.text.utf8Decode[F]).compile.string.map(DecodedText(_))
    }

  67. WIRING IT ALL UP

  68. WIRING IT ALL UP

  69. object Application {
      final case class Config(
        indexer: Indexer.Config,
        imageSource: ImageSource.Config,
        processQueue: ProcessQueue.Config,
        ocr: OCR.Config,
        http: HttpServer.Config,
      )
    }

  70. object Application {
      final case class Config(
        indexer: Indexer.Config,
        imageSource: ImageSource.Config,
        processQueue: ProcessQueue.Config,
        ocr: OCR.Config,
        http: HttpServer.Config,
      )
      def config[F[_]: ApplicativeThrow]: ConfigValue[F, Config] = (
        Indexer.config[F],
        ImageSource.config[F],
        ProcessQueue.config[F],
        OCR.config[F],
        HttpServer.config[F],
      ).parMapN(Config)
    }

  71. object Application {
      final case class Config(
        indexer: Indexer.Config,
        imageSource: ImageSource.Config,
        processQueue: ProcessQueue.Config,
        ocr: OCR.Config,
        http: HttpServer.Config,
      )
      def config[F[_]: ApplicativeThrow]: ConfigValue[F, Config] = (
        Indexer.config[F],
        ImageSource.config[F],
        ProcessQueue.config[F],
        OCR.config[F],
        HttpServer.config[F],
      ).parMapN(Config)
      def run[F[_]: Async: Logger](config: Config): Resource[F, Server] =
        for {
          implicit0(client: Client[F]) <- HttpClient.instance[F]
        }
    }

  72. object Application {
      final case class Config(
        indexer: Indexer.Config,
        imageSource: ImageSource.Config,
        processQueue: ProcessQueue.Config,
        ocr: OCR.Config,
        http: HttpServer.Config,
      )
      def config[F[_]: ApplicativeThrow]: ConfigValue[F, Config] = (
        Indexer.config[F],
        ImageSource.config[F],
        ProcessQueue.config[F],
        OCR.config[F],
        HttpServer.config[F],
      ).parMapN(Config)
      def run[F[_]: Async: Logger](config: Config): Resource[F, Server] =
        for {
          implicit0(client: Client[F]) <- HttpClient.instance[F]
          implicit0(imageSource: ImageSource[F]) <- ImageSource.module[F](config.imageSource).toResource
          implicit0(indexer: Indexer[F]) <- Indexer.module[F](config.indexer)
          implicit0(ocr: OCR[F]) <- OCR.module[F](config.ocr).pure[Resource[F, *]]
        }
    }

  73. object Application {
      final case class Config(
        indexer: Indexer.Config,
        imageSource: ImageSource.Config,
        processQueue: ProcessQueue.Config,
        ocr: OCR.Config,
        http: HttpServer.Config,
      )
      def config[F[_]: ApplicativeThrow]: ConfigValue[F, Config] = (
        Indexer.config[F],
        ImageSource.config[F],
        ProcessQueue.config[F],
        OCR.config[F],
        HttpServer.config[F],
      ).parMapN(Config)
      def run[F[_]: Async: Logger](config: Config): Resource[F, Server] =
        for {
          implicit0(client: Client[F]) <- HttpClient.instance[F]
          implicit0(imageSource: ImageSource[F]) <- ImageSource.module[F](config.imageSource).toResource
          implicit0(indexer: Indexer[F]) <- Indexer.module[F](config.indexer)
          implicit0(ocr: OCR[F]) <- OCR.module[F](config.ocr).pure[Resource[F, *]]
          processQueue <- ProcessQueue.instance(config.processQueue)
        }
    }

  74. object Application {
      final case class Config(
        indexer: Indexer.Config,
        imageSource: ImageSource.Config,
        processQueue: ProcessQueue.Config,
        ocr: OCR.Config,
        http: HttpServer.Config,
      )
      def config[F[_]: ApplicativeThrow]: ConfigValue[F, Config] = (
        Indexer.config[F],
        ImageSource.config[F],
        ProcessQueue.config[F],
        OCR.config[F],
        HttpServer.config[F],
      ).parMapN(Config)
      def run[F[_]: Async: Logger](config: Config): Resource[F, Server] =
        for {
          implicit0(client: Client[F]) <- HttpClient.instance[F]
          implicit0(imageSource: ImageSource[F]) <- ImageSource.module[F](config.imageSource).toResource
          implicit0(indexer: Indexer[F]) <- Indexer.module[F](config.indexer)
          implicit0(ocr: OCR[F]) <- OCR.module[F](config.ocr).pure[Resource[F, *]]
          processQueue <- ProcessQueue.instance(config.processQueue)
          implicit0(index: Index[F]) <- Index.instance[F](processQueue).pure[Resource[F, *]]
          implicit0(download: Download[F]) <- Download.instance[F].pure[Resource[F, *]]
          implicit0(search: Search[F]) <- Search.instance[F](serverInfo.get).pure[Resource[F, *]]
        }
    }

  75. object Application {
      final case class Config(
        indexer: Indexer.Config,
        imageSource: ImageSource.Config,
        processQueue: ProcessQueue.Config,
        ocr: OCR.Config,
        http: HttpServer.Config,
      )
      def config[F[_]: ApplicativeThrow]: ConfigValue[F, Config] = (
        Indexer.config[F],
        ImageSource.config[F],
        ProcessQueue.config[F],
        OCR.config[F],
        HttpServer.config[F],
      ).parMapN(Config)
      def run[F[_]: Async: Logger](config: Config): Resource[F, Server] =
        for {
          implicit0(client: Client[F]) <- HttpClient.instance[F]
          implicit0(imageSource: ImageSource[F]) <- ImageSource.module[F](config.imageSource).toResource
          implicit0(indexer: Indexer[F]) <- Indexer.module[F](config.indexer)
          implicit0(ocr: OCR[F]) <- OCR.module[F](config.ocr).pure[Resource[F, *]]
          processQueue <- ProcessQueue.instance(config.processQueue)
          implicit0(index: Index[F]) <- Index.instance[F](processQueue).pure[Resource[F, *]]
          implicit0(download: Download[F]) <- Download.instance[F].pure[Resource[F, *]]
          implicit0(search: Search[F]) <- Search.instance[F](serverInfo.get).pure[Resource[F, *]]
          server <- HttpServer.instance[F](config.http)
        } yield server
    }
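
The wiring above mixes three kinds of module constructors: plain values, effectful values, and resources. A reduced sketch of that pattern with Cats Effect 3 follows, assuming `cats-effect` is available. All module names here (`Parser`, `Cache`, `Client`, `Modules`) are hypothetical and unrelated to the talk's code. It uses `Resource.pure` and `Resource.eval`, which to my understanding play the same role as `.pure[Resource[F, *]]` and `.toResource` on the slide.

```scala
import cats.effect.{IO, Ref, Resource}
import cats.effect.unsafe.implicits.global

object WiringDemo {
  // Hypothetical modules, one per kind of constructor.
  final case class Parser(languages: List[String])            // built from plain config
  final case class Cache(state: Ref[IO, Map[String, String]]) // allocating it is an effect
  final case class Client(name: String)                       // needs cleanup

  def parser(languages: List[String]): Parser = Parser(languages)

  def cache: IO[Cache] = Ref[IO].of(Map.empty[String, String]).map(Cache(_))

  def client(log: Ref[IO, List[String]]): Resource[IO, Client] =
    Resource.make(log.update(_ :+ "open client").as(Client("http")))(_ =>
      log.update(_ :+ "close client")
    )

  final case class Modules(parser: Parser, cache: Cache, client: Client)

  // Everything is lifted into Resource so one for comprehension can wire it.
  def modules(log: Ref[IO, List[String]]): Resource[IO, Modules] =
    for {
      c <- client(log)                                    // already a Resource
      k <- Resource.eval(cache)                           // an effect, nothing to release
      p <- Resource.pure[IO, Parser](parser(List("eng"))) // a pure value
    } yield Modules(p, k, c)
}
```

Only the constructors that actually hold something to clean up need to return a `Resource`; the rest stay simple and are lifted at the wiring site.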

  76. A COUPLE GUIDELINES
    TESTING

  77. A COUPLE GUIDELINES
    TESTING
    Test the contract, not the implementation

  78. A COUPLE GUIDELINES
    TESTING
    Test the contract, not the implementation
    Prefer fakes over mocks/stubs

  79. A COUPLE GUIDELINES
    TESTING
    Test the contract, not the implementation
    Prefer fakes over mocks/stubs
    Test your fakes with the same suite as the real things
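
One way to read "test your fakes with the same suite as the real things" is to write the contract once against the interface and run it against every implementation. The sketch below assumes `cats-effect`; the `Counter` interface, its in-memory fake and the `contract` function are hypothetical examples, not code from the talk.

```scala
import cats.effect.{IO, Ref}
import cats.effect.unsafe.implicits.global

object ContractDemo {
  // Hypothetical interface, used only to illustrate the guideline.
  trait Counter[F[_]] {
    def increment: F[Unit]
    def get: F[Int]
  }

  // In-memory fake backed by a Ref.
  def inMemory: IO[Counter[IO]] =
    Ref[IO].of(0).map { ref =>
      new Counter[IO] {
        def increment: IO[Unit] = ref.update(_ + 1)
        def get: IO[Int] = ref.get
      }
    }

  // The contract, written once: it knows nothing about the implementation.
  def contract(make: IO[Counter[IO]]): IO[Boolean] =
    for {
      counter <- make
      before <- counter.get
      _ <- counter.increment
      _ <- counter.increment
      after <- counter.get
    } yield before == 0 && after == 2
}
```

The same `contract` would be called with the constructor of the real implementation (for example one backed by a database), so the fake cannot silently drift from the behaviour it stands in for.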

  80. TESTING

  81. TESTING
    index.schedule(Path("/hello"))

  82. TESTING
    val file = fakeFile("hello world", "/hello/world")
    imageSource.uploadFile(file.fileData) *>
      index.schedule(Path("/hello"))

  83. TESTING
    val file = fakeFile("hello world", "/hello/world")
    imageSource.uploadFile(file.fileData) *>
      index.schedule(Path("/hello")) *>
      indexer.search("hello").compile.toList

  84. TESTING
    val file = fakeFile("hello world", "/hello/world")
    (
      imageSource.uploadFile(file.fileData) *>
        index.schedule(Path("/hello")) *>
        indexer.search("hello").compile.toList
    ).map { results =>
      expect(results == List(file.fileDocument))
    }

  85. TESTING
    Await blogpost for more ;)
    val file = fakeFile("hello world", "/hello/world")
    (
      imageSource.uploadFile(file.fileData) *>
        index.schedule(Path("/hello")) *>
        indexer.search("hello").compile.toList
    ).map { results =>
      expect(results == List(file.fileDocument))
    }

  86. SUMMARY

  87. TIPS

  88. TIPS
    Use Resource + IO for stateful dependencies

  89. TIPS
    Use Resource + IO for stateful dependencies
    Define clear responsibilities for modules

  90. TIPS
    Use Resource + IO for stateful dependencies
    Define clear responsibilities for modules
    Design for replacement

  91. TIPS
    Use Resource + IO for stateful dependencies
    Define clear responsibilities for modules
    Design for replacement
    Look for abstractions

  92. TIPS
    Use Resource + IO for stateful dependencies
    Define clear responsibilities for modules
    Design for replacement
    Look for abstractions
    Prototype early

  93. TIPS
    Use Resource + IO for stateful dependencies
    Define clear responsibilities for modules
    Design for replacement
    Look for abstractions
    Prototype early
    Draw some diagrams, see if you have too many arrows ;)

  94. LEARN MORE
    Check out the sources: github.com/kubukoz/dropbox-demo
    Read "Practical FP in Scala" by Gabriel Volpe: leanpub.com/pfp-scala
    github.com/scala-steward-org/scala-steward
    github.com/branchtalk-io/backend
    github.com/kubukoz/spotify-next
    github.com/pitgull/pitgull

  95. THANK YOU
    📰 blog.kubukoz.com
    🐦 @kubukoz
    Slides: speakerdeck.com/kubukoz
    Code: git.io/JONEj
    Find me on YouTube! (yt.kubukoz.com)
