
A dive into Akka Streams: from the basics to a real-world scenario

Reactive streaming is becoming the best approach to handle data flows across asynchronous boundaries. Here, we present the implementation of a real-world application based on Akka Streams. After reviewing the basics, we discuss the development of a data processing pipeline that collects real-time sensor data and sends it to an Amazon Kinesis stream. This architecture has several possible points of failure. What should happen when Kinesis is unavailable? If the data flow is not handled correctly, information may be lost. Akka Streams is the toolkit that enabled us to build reliable processing logic for the pipeline, avoiding data loss and maximizing the robustness of the entire system.

Gioia Ballin

May 14, 2016


Transcript

  1. A dive into Akka Streams: from the basics to a real-world scenario. Gioia Ballin, Simone Fabiano - SW developers
  2. Who are these guys? Gioia Ballin (Twitter: @GioiaBallin) and Simone Fabiano (Twitter: @mone_tpatpc). R&D, Data Analysis, Microservices, Infrastructure, Web, Blog
  3. The Collector Service: two months ago... [architecture diagram: TCP connections deliver chunks to inbox actors; each inbox actor turns the chunks into events and its Kinesis client puts them to Amazon Kinesis]
  4. Welcome Akka Streams! Backpressure on the network + easy fit for the stream paradigm. Typed flows & increased readability
  5. val source = Source(1 to 42)
     val flow = Flow[Int].map(_ + 1)
     val sink = Sink.foreach(println)
     val graph = source.via(flow).to(sink)
     graph.run()
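
For reference, running the graph above also requires an actor system and a materializer in scope. A minimal self-contained sketch (object and system names are illustrative):

    import akka.actor.ActorSystem
    import akka.stream.ActorMaterializer
    import akka.stream.scaladsl.{Flow, Sink, Source}

    object QuickStart extends App {
      // run() needs an implicit Materializer, which in turn needs an ActorSystem
      implicit val system: ActorSystem = ActorSystem("quickstart")
      implicit val materializer: ActorMaterializer = ActorMaterializer()

      val source = Source(1 to 42)
      val flow   = Flow[Int].map(_ + 1)
      val sink   = Sink.foreach[Int](println)

      // Wire source -> flow -> sink and run the graph; prints 2 to 43
      source.via(flow).to(sink).run()
    }
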
  6. Flows are collections (almost): map, filter, fold, … are OK. No flatMap! (there’s mapConcat though). Extras: mapAsync, mapAsyncUnordered, buffer, ...
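
As a small illustration of the points above, mapConcat can stand in for flatMap-style flattening and mapAsync handles asynchronous calls. The fetch function below is a hypothetical lookup, stubbed for the sketch:

    import scala.concurrent.Future
    import akka.stream.scaladsl.Flow

    // mapConcat flattens each element into zero or more elements, much like flatMap on collections
    val words = Flow[String].mapConcat(line => line.split(" ").toList)

    // mapAsync runs up to 4 futures in parallel while keeping the element order
    def fetch(id: Int): Future[String] = Future.successful(s"item-$id")  // hypothetical lookup, stubbed here
    val enriched = Flow[Int].mapAsync(parallelism = 4)(fetch)
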
  7. path("data") {
       post {
         extractRequest { request =>
           val source = request.entity.dataBytes
           val flow = processing()
           val sink = Sink.ignore
           source.via(flow).runWith(sink)
           …
         }
       }
     }
  8. When the available stages are not enough, write your own:
     class PersistentBuffer[A](...) extends GraphStage[FlowShape[A, A]] {
       val in = Inlet[A]("PersistentBuffer.in")
       val out = Outlet[A]("PersistentBuffer.out")
       override val shape = FlowShape.of(in, out)
  9. Custom stages: State
     override def createLogic(attr: Attributes): GraphStageLogic =
       new GraphStageLogic(shape) {
         var state: StageState[A] = initialState[A]
  10. val cb = getAsyncCallback[Try[A]] {
        case Success(elements: A) => state = state.copy(...)
        ...
      }
      queue.getCurrentQueue.onComplete(cb.invoke)
  11. Custom Stages: ports
      setHandler(in, new InHandler {
        override def onPush() = {
          val element = grab(in)
          pull(in)
          ...
        }
        ...
      })
      setHandler(out, new OutHandler {
        override def onPull(): Unit = {
          push(out, something)
        }
        ...
      })
  12. Custom Stages: Start and stop
      override def postStop(): Unit = { ... }
      override def preStart(): Unit = { ... }
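
Putting the fragments from slides 8-12 together, here is a minimal self-contained custom stage. It is only an illustrative sketch (a pass-through stage that counts elements), not the PersistentBuffer from the talk:

    import akka.stream.{Attributes, FlowShape, Inlet, Outlet}
    import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}

    // A pass-through stage that counts the elements it forwards
    class CountingPassThrough[A] extends GraphStage[FlowShape[A, A]] {
      val in = Inlet[A]("CountingPassThrough.in")
      val out = Outlet[A]("CountingPassThrough.out")
      override val shape = FlowShape.of(in, out)

      override def createLogic(attr: Attributes): GraphStageLogic =
        new GraphStageLogic(shape) {
          // mutable state lives inside the logic: one instance per materialization
          var count = 0L

          override def preStart(): Unit = println("stage starting")
          override def postStop(): Unit = println(s"stage stopped after $count elements")

          setHandler(in, new InHandler {
            override def onPush(): Unit = {
              val element = grab(in)  // take the element from the input port
              count += 1
              push(out, element)      // emit it downstream
            }
          })

          setHandler(out, new OutHandler {
            override def onPull(): Unit = pull(in)  // downstream demand: ask upstream for more
          })
        }
    }

The stage can then be plugged into a pipeline with Flow.fromGraph(new CountingPassThrough[Int]).
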
  13. Automatic Fusion and Async Stages
      By default a stream is handled by 1 thread: no cross-thread communication (faster code!) but also no parallelism (slower code!).
      An async boundary will split your stream across different threads:
      source
        .via(doSomething).async
        .via(doSomething).async
        .runWith(Sink.foreach(println))
  14. Materialized Values
      val sink: Sink[Any, Future[Done]] = Sink.ignore
      .runWith(sink) is a shortcut for .toMat(sink)(Keep.right).run()
      When the stream runs, you obtain a materialized value for each stage.
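
To make the shortcut concrete, a small sketch of choosing which materialized value to keep (assumes an implicit materializer in scope, as in the earlier sketch):

    import akka.{Done, NotUsed}
    import akka.stream.scaladsl.{Keep, Sink, Source}
    import scala.concurrent.Future

    val source: Source[Int, NotUsed] = Source(1 to 42)
    val sink: Sink[Any, Future[Done]] = Sink.ignore

    // runWith keeps the sink's materialized value...
    val done1: Future[Done] = source.runWith(sink)

    // ...which is the same as wiring with toMat and Keep.right
    val done2: Future[Done] = source.toMat(sink)(Keep.right).run()

    // Keep.both keeps the materialized values of both stages
    val both: (NotUsed, Future[Done]) = source.toMat(sink)(Keep.both).run()
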
  15. Materialized Values Example: TestKit
      val myFlow = Flow[String].map { v => v.take(1) }
      val (pub, sub) = TestSource.probe[String]
        .via(myFlow)
        .toMat(TestSink.probe[Any])(Keep.both)
        .run()
      sub.request(1)
      pub.sendNext("Gathering")
      sub.expectNext("G")
  16. Stream Ending: Supervision
      RESUME > drop the failing element > keep going
      RESTART > drop the failing element > reset the stage state > keep going
      STOP > fail all the things!
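
For illustration, a minimal sketch of attaching a supervision strategy to a flow; the decider and the failure case are made up for the example, and an implicit materializer is assumed in scope:

    import akka.stream.{ActorAttributes, Supervision}
    import akka.stream.scaladsl.{Flow, Sink, Source}

    // Resume on arithmetic errors (drop the failing element), stop on anything else
    val decider: Supervision.Decider = {
      case _: ArithmeticException => Supervision.Resume
      case _                      => Supervision.Stop
    }

    val safeFlow = Flow[Int]
      .map(n => 100 / n)  // n == 0 would normally fail the whole stream
      .withAttributes(ActorAttributes.supervisionStrategy(decider))

    // With Resume the 0 is dropped and the stream keeps going: prints 20 and 50
    Source(List(5, 0, 2)).via(safeFlow).runWith(Sink.foreach(println))
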