Akka Streams. Introduction

A talk about Akka Streams given at Zalando Finland on 09.03.2017.

Ivan Yurchenko

March 09, 2017

Transcript

  1. Akka Streams
    Introduction
    Ivan Yurchenko
    @ivan0yu

  2. Motivation
    ● Data processing is often a pipeline of stages
    ● It might be complex: asynchronous stages of different speed, I/O,
    complex topology (merges, broadcasts, etc.)
    ● This implies buffering, queues, congestion control, etc., which might
    be difficult to get right
    ● Actor systems are technically good for this, but quite low-level =>
    bug-prone and lots of boilerplate
    ● High-level programming libraries (Rx*), frameworks (Apache Camel)
    and systems (Apache Storm, Twitter Heron, Apache Flink, etc.) exist

  3. Akka Streams
    ● A way to build arbitrarily complex type-safe data processing
    pipelines
    ● Pipelines consist of stages
    ● Stages are composable and reusable
    ● Stages might themselves be complex and consist of smaller sub-pipelines
    ● Stages can be executed asynchronously (in different
    ExecutionContexts)
    ● Not distributed [yet]

  4. Akka Streams basics
    In general: data processing means passing data through an arbitrarily
    complex graph of transformations/actions
    Most common:
    Source → Flow → … → Flow → Sink

  5. Akka Streams basics
    val helloWorldStream1: RunnableGraph[NotUsed] =
      Source.single("Hello world")
        .via(Flow[String].map(s => s.toUpperCase()))
        .to(Sink.foreach(println))
    val helloWorldStream2: RunnableGraph[NotUsed] =
      Source.single("Hello world")
        .map(s => s.toUpperCase())
        .to(Sink.foreach(println))

  6. Materialization
    Materializer (interface) -- ActorMaterializer (implementation)
    implicit val actorSystem = ActorSystem("akka-streams-example")
    implicit val materializer = ActorMaterializer()
    helloWorldStream1.run()
    // prints: HELLO WORLD

  7. Lots of stages out of the box
    Source: fromIterator, single, repeat, cycle, tick, fromFuture,
    unfold, empty, failed, actorPublisher, actorRef, queue,
    fromPath, ...
    Sink: head, headOption, last, lastOption, ignore, cancelled, seq,
    foreach, foreachParallel, queue, fold, reduce, actorRef,
    actorRefWithAck, actorSubscriber, toPath, ...
    Flow: map, mapAsync, mapConcat, statefulMapConcat, filter,
    grouped, sliding, scan, scanAsync, fold, foldAsync, take,
    takeWhile, drop, dropWhile, recover, recoverWith, throttle,
    intersperse, limit, delay, buffer, monitor, ...
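
A few of the built-in stages listed above can be chained as in the sketch below. It is only an illustration: the object name `BuiltInStages` and the numbers are made up, and it assumes the same ActorMaterializer-based setup as the earlier slides.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical example object, not part of the talk.
object BuiltInStages {
  def run(): Seq[Seq[Int]] = {
    implicit val system: ActorSystem = ActorSystem("built-in-stages")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      val resultF =
        Source.fromIterator(() => Iterator.from(1)) // infinite source: 1, 2, 3, ...
          .filter(_ % 2 == 0)                       // keep even numbers
          .map(_ * 10)                              // transform each element
          .take(6)                                  // complete after six elements
          .grouped(2)                               // emit groups of two
          .runWith(Sink.seq)                        // collect everything into a Seq
      Await.result(resultF, 5.seconds)
    } finally system.terminate()
  }
}
```

Calling `BuiltInStages.run()` should yield three groups: (20, 40), (60, 80) and (100, 120).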

  8. Composition and reusability
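
The slide itself is a diagram, so here is a minimal sketch of what composition and reuse can look like in code. The names `trim`, `upperCase` and `normalize` are hypothetical; the point is only that small Flows combine into a bigger Flow that can be reused like any other stage.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical example object, not part of the talk.
object ReusableStages {
  // Two small reusable stages.
  val trim: Flow[String, String, NotUsed] = Flow[String].map(_.trim)
  val upperCase: Flow[String, String, NotUsed] = Flow[String].map(_.toUpperCase)

  // Composing stages gives a bigger stage, which is again an ordinary Flow.
  val normalize: Flow[String, String, NotUsed] = trim.via(upperCase)

  def run(input: List[String]): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("reusable-stages")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try Await.result(Source(input).via(normalize).runWith(Sink.seq), 5.seconds)
    finally system.terminate()
  }
}
```

The same `normalize` value can be plugged into any number of streams, because a Flow is only a description until it is materialized.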

  9. Materialized values
    ● A materialized value is what we get back when a stream is
    materialized by a Materializer
    ● Not the result of a stream (a stream might not even have a
    result as such)
    ● Each stage creates its own materialized value
    ● It’s up to us which one we want to have at the end

  10. Materialized values
    NotUsed – materialized value, but not really useful
    val helloWorldStream1: RunnableGraph[NotUsed] =
      Source.single("Hello world")
        .via(Flow[String].map(s => s.toUpperCase()))
        .to(Sink.foreach(println))
    val materializedValue: NotUsed = helloWorldStream1.run()
    Future[Done] – much more useful
    val helloWorldStream2: RunnableGraph[Future[Done]] =
      Source.single("Hello world")
        .map(s => s.toUpperCase())
        .toMat(Sink.foreach(println))(Keep.right)
    val doneF: Future[Done] = helloWorldStream2.run()
    doneF.onComplete { … }

  11. Materialized values in composition
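
The slide is a diagram; the sketch below shows one possible reading of it in code. Each stage has its own materialized value, and `via`/`to` versus `viaMat`/`toMat` with `Keep.left`, `Keep.right` or `Keep.both` decide which one the composed graph exposes. All names (`MatValues`, `doubler`, `sum`, ...) are hypothetical.

```scala
import akka.{Done, NotUsed}
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Keep, RunnableGraph, Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical example object, not part of the talk.
object MatValues {
  val source: Source[Int, NotUsed] = Source(1 to 5)
  // watchTermination gives this flow a Future[Done] as its materialized value.
  val doubler: Flow[Int, Int, Future[Done]] =
    Flow[Int].map(_ * 2).watchTermination()(Keep.right)
  val sum: Sink[Int, Future[Int]] = Sink.fold[Int, Int](0)(_ + _)

  // via/to keep the left-hand (upstream) value by default.
  val keepsSource: RunnableGraph[NotUsed] = source.via(doubler).to(sum)
  // viaMat/toMat let us choose explicitly.
  val keepsFlow: RunnableGraph[Future[Done]] = source.viaMat(doubler)(Keep.right).to(sum)
  val keepsSink: RunnableGraph[Future[Int]] = source.via(doubler).toMat(sum)(Keep.right)
  val keepsBoth: RunnableGraph[(Future[Done], Future[Int])] =
    source.viaMat(doubler)(Keep.right).toMat(sum)(Keep.both)

  def run(): Int = {
    implicit val system: ActorSystem = ActorSystem("mat-values")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      val (doneF, sumF) = keepsBoth.run()
      Await.result(doneF, 5.seconds) // the flow has terminated
      Await.result(sumF, 5.seconds)  // the folded result of the sink
    } finally system.terminate()
  }
}
```

All four graphs process the same elements; they differ only in the type that `run()` returns.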

  12. Kill switches
    val stream: RunnableGraph[(UniqueKillSwitch, Future[Done])] =
      Source.single("Hello world")
        .map(s => s.toUpperCase())
        .viaMat(KillSwitches.single)(Keep.right)
        .toMat(Sink.foreach(println))(Keep.both)
    val (killSwitch, doneF): (UniqueKillSwitch, Future[Done]) =
      stream.run()
    killSwitch.shutdown()
    // or
    killSwitch.abort(new Exception("Exception from KillSwitch"))

  14. Back pressure
    ● Different speeds of stages (producer/consumer) cause problems
    ● We know how to deal with these problems
    ● Back pressure – a mechanism for the consumer to signal to the
    producer how much incoming data it has capacity for

  15. Back pressure
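
The slide is a diagram. One way to observe back pressure in code is sketched below: an infinite, fast source feeds a deliberately slow downstream (`throttle`), and a counter records how many elements the source was actually asked to produce. The object name, buffer size and rates are arbitrary assumptions for illustration.

```scala
import java.util.concurrent.atomic.AtomicInteger
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, OverflowStrategy, ThrottleMode}
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical example object, not part of the talk.
object BackPressureDemo {
  // Returns the consumed elements and the number of elements the source produced.
  def run(): (Seq[Int], Int) = {
    implicit val system: ActorSystem = ActorSystem("back-pressure")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    val produced = new AtomicInteger(0)
    try {
      val resultF =
        Source.fromIterator(() => Iterator.from(1))        // could produce without limit
          .map { i => produced.incrementAndGet(); i }      // count what is really produced
          .buffer(4, OverflowStrategy.backpressure)        // small buffer, then signal upstream to wait
          .throttle(1, 50.millis, 1, ThrottleMode.Shaping) // slow consumer: one element per 50 ms
          .take(5)
          .runWith(Sink.seq)
      val result = Await.result(resultF, 10.seconds)
      (result, produced.get())
    } finally system.terminate()
  }
}
```

Although the iterator is infinite, the produced count stays close to the number of consumed elements plus the buffer size, because the source only emits when downstream signals demand.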

  16. Practical example – consuming from Nakadi
    ● Send a single HTTP GET request
    ● Receive an infinite HTTP response
    ● One line = one event batch, which needs to be parsed
    ● Process batches

  17. Practical example – consuming from Nakadi
    val http = Http(actorSystem)
    // outgoingConnectionHttps takes a host name, not a URL
    val nakadiConnectionFlow = http.outgoingConnectionHttps("nakadi-url.com", 443)
    val getRequest = HttpRequest(HttpMethods.GET, "/")
    val eventBatchSource: Source[EventBatch, NotUsed] =
      // The stream starts with a single request object ...
      Source.single(getRequest)
        // ... that goes through a connection (i.e. is sent to the server)
        .via(nakadiConnectionFlow)
        .flatMapConcat {
          case response @ HttpResponse(StatusCodes.OK, _, _, _) =>
            response.entity.dataBytes
              // Decompress deflate-compressed bytes.
              .via(Deflate.decoderFlow)
              // Coalesce chunks into a line.
              .via(Framing.delimiter(ByteString("\n"), Int.MaxValue))
              // Deserialize JSON.
              .map(bs => Json.read[EventBatch](bs.utf8String))
          // process erroneous responses
        }
    eventBatchSource.map(...).to(...) // process batches
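
The "coalesce chunks into a line" step can be tried in isolation. The sketch below feeds `Framing.delimiter` a few hand-made chunks that split JSON lines at arbitrary points, the way a network response might. The sample payload and the 1024-byte frame limit are made up for illustration.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Framing, Sink, Source}
import akka.util.ByteString
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical example object, not part of the talk.
object FramingDemo {
  def run(): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("framing")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    // Chunk boundaries do not match line boundaries.
    val chunks = List(
      ByteString("{\"a\":"),
      ByteString("1}\n{\"b\""),
      ByteString(":2}\n"))
    try {
      val linesF =
        Source(chunks)
          // Re-assemble chunks into newline-delimited frames (delimiter is dropped).
          .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024))
          .map(_.utf8String)
          .runWith(Sink.seq)
      Await.result(linesF, 5.seconds)
    } finally system.terminate()
  }
}
```

A bounded `maximumFrameLength` makes the stream fail on an overlong line instead of buffering it without limit, which is worth considering as an alternative to `Int.MaxValue` when the input is untrusted.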

  18. GraphDSL

  19. GraphDSL
    import akka.stream.scaladsl.GraphDSL.Implicits._
    RunnableGraph.fromGraph(GraphDSL.create() { implicit builder =>
      val A: Outlet[Int]                  = builder.add(Source.single(0)).out
      val B: UniformFanOutShape[Int, Int] = builder.add(Broadcast[Int](2))
      val C: UniformFanInShape[Int, Int]  = builder.add(Merge[Int](2))
      val D: FlowShape[Int, Int]          = builder.add(Flow[Int].map(_ + 1))
      val E: UniformFanOutShape[Int, Int] = builder.add(Balance[Int](2))
      val F: UniformFanInShape[Int, Int]  = builder.add(Merge[Int](2))
      val G: Inlet[Any]                   = builder.add(Sink.foreach(println)).in

                    C     <~      F
      A  ~>  B  ~>  C     ~>      F
             B  ~>  D  ~>  E  ~>  F
                           E  ~>  G

      ClosedShape
    })
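
The graph above contains a cycle and only prints, so it is hard to check its behaviour. A smaller fan-out/fan-in graph with an observable result could look like the sketch below. It is an assumption-laden illustration, not the graph from the talk: the object name and the two branch functions are made up, and the sink is passed to `GraphDSL.create` so that its materialized value is exposed.

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ClosedShape}
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, RunnableGraph, Sink, Source}
import scala.collection.immutable
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical example object, not part of the talk.
object FanOutFanIn {
  val graph: RunnableGraph[Future[immutable.Seq[Int]]] =
    RunnableGraph.fromGraph(GraphDSL.create(Sink.seq[Int]) { implicit builder => sink =>
      import GraphDSL.Implicits._
      val broadcast = builder.add(Broadcast[Int](2)) // every element goes to both branches
      val merge = builder.add(Merge[Int](2))         // branches are merged in arrival order

      Source(1 to 3) ~> broadcast ~> Flow[Int].map(_ * 10) ~> merge ~> sink
      broadcast ~> Flow[Int].map(_ + 1) ~> merge
      ClosedShape
    })

  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("fan-out-fan-in")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    // Merge gives no ordering guarantee between branches, so sort before comparing.
    try Await.result(graph.run(), 5.seconds).sorted
    finally system.terminate()
  }
}
```

Each input element appears twice in the output, once per branch.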

  20. Integration with Akka actors
    ● An actor can be a Source or a Sink
    ● The back pressure protocol consists of normal actor messages
    class LongCounter extends ActorPublisher[Long] {
      private var counter = 0L
      override def receive: Receive = {
        case ActorPublisherMessage.Request(n) =>
          // Emit exactly n elements, never more than the requested demand
          for (_ <- 1L to n) {
            counter += 1
            onNext(counter)
          }
        case ActorPublisherMessage.Cancel =>
          context.stop(self)
      }
    }
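
The Sink direction can be sketched with `Sink.actorRefWithAck`, one of the sinks listed earlier. Here the back pressure protocol is visible as plain messages: the stream sends an element and waits for an acknowledgement before sending the next one. The message objects `Init`, `Ack`, `Complete` and the `Summer` actor are hypothetical names, and the sketch assumes an Akka version in which `Sink.actorRefWithAck` is available.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

// Hypothetical example object, not part of the talk.
object ActorSinkDemo {
  // Protocol messages exchanged between the stream and the actor.
  case object Init
  case object Ack
  case object Complete

  // Sums the elements; replying with Ack signals demand for the next element.
  class Summer(result: Promise[Int]) extends Actor {
    private var sum = 0
    override def receive: Receive = {
      case Init =>
        sender() ! Ack
      case n: Int =>
        sum += n
        sender() ! Ack
      case Complete =>
        result.success(sum)
        context.stop(self)
    }
  }

  def run(): Int = {
    implicit val system: ActorSystem = ActorSystem("actor-sink")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    val result = Promise[Int]()
    val summer = system.actorOf(Props(new Summer(result)))
    try {
      Source(1 to 5).runWith(Sink.actorRefWithAck(summer, Init, Ack, Complete))
      Await.result(result.future, 5.seconds)
    } finally system.terminate()
  }
}
```

If the actor delays its `Ack`, the stream simply waits, so a slow actor slows the whole pipeline down instead of being flooded with messages.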

  21. Conclusion
    ● Akka Streams – a way to build arbitrarily complex type-safe data
    processing pipelines
    ● Complex inside, but the interface is reasonably simple
    ● Gives control over execution, including back pressure and
    asynchronous execution
    ● Don’t misuse it: it might not be suitable for the task

  22. Where to get information
    ● The official documentation
    http://doc.akka.io/docs/akka/current/scala/stream/index.html
    ● Akka team blog
    http://blog.akka.io/

  23. Built on top of Akka Streams
    ● Akka HTTP – HTTP client and server
    http://doc.akka.io/docs/akka-http/current/scala.html
    ● Alpakka – enterprise integration patterns (like Apache Camel) (WIP)
    http://developer.lightbend.com/docs/alpakka/current/

  24. Blog post version of this presentation
    About Akka Streams
    https://tech.zalando.com/blog/about-akka-streams/

  25. Questions?
