Slide 1

Slide 1 text

Akka Streams Introduction
Ivan Yurchenko

Slide 2

Slide 2 text

ABOUT ME
● Ivan Yurchenko.
● Currently work at Zalando in Helsinki.
● Have been working in several teams: mobile backend, search, domain knowledge service.
● Mostly use Scala now.
● Contacts:
  ○ [email protected]
  ○ https://ivanyu.me/
  ○ https://linkedin.com/in/ivanyurchenko/
  ○ https://twitter.com/ivan0yu

Slide 3

Slide 3 text

AT A GLANCE: EUROPE’S LARGEST ONLINE FASHION RETAILER
● 15 countries
● 21 million active customers
● 200 million visits per month
● ~3.64 billion € revenue 2016
● 13.000+ employees
● 100+ nationalities
● Tech HQ in Berlin
● 1800 employees in Tech
Visit us: jobs.zalando.com

Slide 4

Slide 4 text

ZALANDO HELSINKI TECH HUB
Zalando Helsinki site was opened in August 2015, moved to new office in August 2016.
BUILDING OUR ECOMMERCE PLATFORM
● AWS, Microservices, Scala, Android and iOS
● 108 employees
● Autonomous delivery teams working with modern technologies 12
● 31 Nationalities
● Our office is located in KAMPPI

Slide 5

Slide 5 text

MOTIVATION
● Data processing is often a pipeline of stages
● A pipeline might be complex: asynchronous stages of different speeds, I/O, complex topology (merges, broadcasts, etc.)
● This implies buffering, queues, congestion control, etc., which might be difficult to get right
● Actor systems are technically good for this, but quite low-level => bug-prone and lots of boilerplate
● High-level programming libraries (Rx*), frameworks (Apache Camel) and systems (Apache Storm, Twitter Heron, Apache Flink, etc.) exist

Slide 6

Slide 6 text

AKKA STREAMS
● A way to build arbitrarily complex type-safe data processing pipelines
● Pipelines consist of stages
● Stages are composable and reusable
● Stages might be complex and consist of smaller sub-pipelines
● Stages can be executed asynchronously (in different ExecutionContexts)
● Not distributed [yet]
● New: compatible with Java 9’s java.util.concurrent.Flow

Slide 7

Slide 7 text

AKKA STREAMS BASICS
In general: data processing means passing data through an arbitrarily complex graph of transformations/actions
Most common: Source → Flow → … → Flow → Sink

Slide 8

Slide 8 text

AKKA STREAMS BASICS
    val helloWorldStream1: RunnableGraph[NotUsed] =
      Source.single("Hello world")
        .via(Flow[String].map(s => s.toUpperCase()))
        .to(Sink.foreach(println))

    val helloWorldStream2: RunnableGraph[NotUsed] =
      Source.single("Hello world")
        .map(s => s.toUpperCase())
        .to(Sink.foreach(println))

Slide 9

Slide 9 text

MATERIALIZATION
Materializer (interface) -- ActorMaterializer (implementation)
    implicit val actorSystem = ActorSystem("akka-streams-example")
    implicit val materializer = ActorMaterializer()
    // Either of the hello-world graphs from the previous slide can be run here.
    helloWorldStream1.run()
Output: HELLO WORLD
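To make the blueprint/materialization distinction concrete, here is a minimal sketch, assuming Akka 2.5.x (as in the talk) on the classpath. The object name `MaterializationSketch` and the use of `Sink.seq` instead of `Sink.foreach(println)` are illustrative choices, not part of the original slides. The graph is defined once and materialized twice; each run is an independent stream.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Keep, RunnableGraph, Sink, Source}

import scala.collection.immutable
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object MaterializationSketch {
  // A RunnableGraph is only a description (a blueprint); nothing runs yet.
  val blueprint: RunnableGraph[Future[immutable.Seq[String]]] =
    Source.single("Hello world")
      .map(_.toUpperCase)
      .toMat(Sink.seq[String])(Keep.right)

  // Materializes the same blueprint twice with one materializer.
  def runTwice(): (immutable.Seq[String], immutable.Seq[String]) = {
    implicit val system: ActorSystem = ActorSystem("materialization-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      val first = Await.result(blueprint.run(), 5.seconds)
      val second = Await.result(blueprint.run(), 5.seconds)
      (first, second)
    } finally system.terminate()
  }
}
```

Blocking with `Await` is used only to keep the sketch short; real code would normally compose the returned futures instead.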

Slide 10

Slide 10 text

LOTS OF STAGES OUT OF THE BOX
Source: fromIterator, single, repeat, cycle, tick, fromFuture, unfold, empty, failed, actorPublisher, actorRef, queue, fromPath, ...
Sink: head, headOption, last, lastOption, ignore, cancelled, seq, foreach, foreachParallel, queue, fold, reduce, actorRef, actorRefWithAck, actorSubscriber, toPath, ...
Flow: map, mapAsync, mapConcat, statefulMapConcat, filter, grouped, sliding, scan, scanAsync, fold, foldAsync, take, takeWhile, drop, dropWhile, recover, recoverWith, throttle, intersperse, limit, delay, buffer, monitor, ...
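A few of the built-in stages listed above can be chained as follows. This is a minimal sketch, assuming Akka 2.5.x; the object name `BuiltInStagesSketch` and the input numbers are made up for illustration.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

import scala.collection.immutable
import scala.concurrent.Await
import scala.concurrent.duration._

object BuiltInStagesSketch {
  def run(): immutable.Seq[immutable.Seq[Int]] = {
    implicit val system: ActorSystem = ActorSystem("built-in-stages-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    val result = Source(1 to 10)
      .filter(_ % 2 == 0) // keeps 2, 4, 6, 8, 10
      .map(_ * 10)        // 20, 40, 60, 80, 100
      .grouped(2)         // groups of at most two elements
      .runWith(Sink.seq)  // collects everything into a Future of a Seq
    try Await.result(result, 5.seconds)
    finally system.terminate()
  }
}
```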

Slide 11

Slide 11 text

COMPOSITION AND REUSABILITY
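The slide itself carries no code, so the following is only a sketch of what composition and reuse can look like, assuming Akka 2.5.x. All names (`CompositionSketch`, `normalize`, `nonEmpty`, `clean`, `cleanToSeq`) are hypothetical. Small flows are combined into a bigger flow, and a flow plus a sink is itself a sink that can be reused with any source of strings.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Keep, Sink, Source}

import scala.collection.immutable
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object CompositionSketch {
  // Small reusable stages.
  val normalize: Flow[String, String, NotUsed] = Flow[String].map(_.trim.toLowerCase)
  val nonEmpty: Flow[String, String, NotUsed] = Flow[String].filter(_.nonEmpty)

  // A bigger stage composed of the smaller ones.
  val clean: Flow[String, String, NotUsed] = normalize.via(nonEmpty)

  // Flow + Sink = Sink; the materialized value of Sink.seq is kept.
  val cleanToSeq: Sink[String, Future[immutable.Seq[String]]] =
    clean.toMat(Sink.seq[String])(Keep.right)

  def run(input: immutable.Seq[String]): immutable.Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("composition-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try Await.result(Source(input).runWith(cleanToSeq), 5.seconds)
    finally system.terminate()
  }
}
```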

Slide 12

Slide 12 text

MATERIALIZED VALUES
● It’s something that we get when a stream is materialized by a Materializer
● Not the result of a stream (a stream might not even have a result as such)
● Each stage creates its own materialized value
● It’s up to us which one we want to have at the end

Slide 13

Slide 13 text

MATERIALIZED VALUES
NotUsed – materialized value, but not really useful
    val helloWorldStream1: RunnableGraph[NotUsed] =
      Source.single("Hello world")
        .via(Flow[String].map(s => s.toUpperCase()))
        .to(Sink.foreach(println))
    val materializedValue: NotUsed = helloWorldStream1.run()
Future[Done] – much more useful
    val helloWorldStream2: RunnableGraph[Future[Done]] =
      Source.single("Hello world")
        .map(s => s.toUpperCase())
        .toMat(Sink.foreach(println))(Keep.right)
    val doneF: Future[Done] = helloWorldStream2.run()
    doneF.onComplete { … }

Slide 14

Slide 14 text

MATERIALIZED VALUES IN COMPOSITION
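The original slide is a diagram, so this is only a sketch of how materialized values can be combined during composition, assuming Akka 2.5.x. The object name and the counting side sink are illustrative. `Keep.right` selects the side sink's value over the source's `NotUsed`, and `Keep.both` then pairs it with the final sink's value.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Keep, RunnableGraph, Sink, Source}

import scala.collection.immutable
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object MatValueCompositionSketch {
  val graph: RunnableGraph[(Future[Int], Future[immutable.Seq[Int]])] =
    Source(1 to 3)
      // Side sink that counts elements; keep its Future[Int].
      .alsoToMat(Sink.fold[Int, Int](0)((count, _) => count + 1))(Keep.right)
      .map(_ * 2)
      // Keep both the count and the collected elements.
      .toMat(Sink.seq[Int])(Keep.both)

  def run(): (Int, immutable.Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("mat-value-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      val (countF, elementsF) = graph.run()
      (Await.result(countF, 5.seconds), Await.result(elementsF, 5.seconds))
    } finally system.terminate()
  }
}
```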

Slide 15

Slide 15 text

KILL SWITCHES
    val stream: RunnableGraph[(UniqueKillSwitch, Future[Done])] =
      Source.single("Hello world")
        .map(s => s.toUpperCase())
        .viaMat(KillSwitches.single)(Keep.right)
        .toMat(Sink.foreach(println))(Keep.both)
    val (killSwitch, doneF): (UniqueKillSwitch, Future[Done]) = stream.run()
    killSwitch.shutdown()
    // or
    killSwitch.abort(new Exception("Exception from KillSwitch"))

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

BACK PRESSURE
● Different speeds of stages (producer/consumer) cause problems
● We know how to deal with these problems
● Back pressure – a mechanism for the consumer to signal to the producer how much incoming data it can accept

Slide 18

Slide 18 text

BACK PRESSURE
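Back pressure can be observed with a small experiment. This is a minimal sketch, assuming Akka 2.5.x; the object name, the throttle rate and the counter are illustrative. The source is an infinite iterator that could produce elements as fast as the CPU allows, but because the throttled downstream requests elements slowly, the number of elements actually pulled from the iterator stays close to the number consumed.

```scala
import java.util.concurrent.atomic.AtomicInteger

import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ThrottleMode}
import akka.stream.scaladsl.{Sink, Source}

import scala.collection.immutable
import scala.concurrent.Await
import scala.concurrent.duration._

object BackPressureSketch {
  // Returns the consumed elements and how many the source had to produce.
  def run(): (immutable.Seq[Int], Int) = {
    implicit val system: ActorSystem = ActorSystem("back-pressure-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    val produced = new AtomicInteger(0)
    val result = Source.fromIterator(() => Iterator.from(1)) // infinite, fast producer
      .map { n => produced.incrementAndGet(); n }            // count what was pulled
      .throttle(1, 20.millis, 1, ThrottleMode.Shaping)       // slow consumer side
      .take(5)
      .runWith(Sink.seq)
    try {
      val elements = Await.result(result, 5.seconds)
      (elements, produced.get())
    } finally system.terminate()
  }
}
```

Without back pressure, the infinite iterator would be drained without bound into some buffer; here the producer only advances when demand arrives from downstream.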

Slide 19

Slide 19 text

PRACTICAL EXAMPLE – CONSUMING FROM NAKADI
● Send a single HTTP GET request
● Receive an infinite HTTP response
● One line = one event batch – need to parse
● Process batches

Slide 20

Slide 20 text

PRACTICAL EXAMPLE – CONSUMING FROM NAKADI
    val http = Http(actorSystem)
    // outgoingConnectionHttps expects a host name, not a URL with a scheme.
    val nakadiConnectionFlow = http.outgoingConnectionHttps("nakadi-url.com", 443)
    val getRequest = HttpRequest(HttpMethods.GET, "/")
    val eventBatchSource: Source[EventBatch, NotUsed] =
      // The stream starts with a single request object ...
      Source.single(getRequest)
        // ... that goes through a connection (i.e. is sent to the server)
        .via(nakadiConnectionFlow)
        .flatMapConcat {
          case response @ HttpResponse(StatusCodes.OK, _, _, _) =>
            response.entity.dataBytes
              // Decompress deflate-compressed bytes.
              .via(Deflate.decoderFlow)
              // Coalesce chunks into a line.
              .via(Framing.delimiter(ByteString("\n"), Int.MaxValue))
              // Deserialize JSON.
              .map(bs => Json.read[EventBatch](bs.utf8String))
          // process erroneous responses
        }
    eventBatchSource.map(...).to(...) // process batches
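The "coalesce chunks into a line" step can be tried in isolation. This is a minimal sketch, assuming Akka 2.5.x; the object name and the sample JSON lines are made up, and the chunk boundaries are deliberately placed in the middle of a line, as a network might deliver them. `Framing.delimiter` reassembles complete lines regardless of how the bytes were chunked.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Framing, Sink, Source}
import akka.util.ByteString

import scala.collection.immutable
import scala.concurrent.Await
import scala.concurrent.duration._

object FramingSketch {
  def run(): immutable.Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("framing-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    // Two chunks; the second line is split across them.
    val chunks = List("{\"a\":1}\n{\"b\"", ":2}\n{\"c\":3}\n").map(ByteString(_))
    val result = Source(chunks)
      .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024))
      .map(_.utf8String)
      .runWith(Sink.seq)
    try Await.result(result, 5.seconds)
    finally system.terminate()
  }
}
```

A bounded `maximumFrameLength` is used here instead of `Int.MaxValue` so that a missing delimiter fails the stream rather than buffering without limit.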

Slide 21

Slide 21 text

GraphDSL

Slide 22

Slide 22 text

GraphDSL
    import akka.stream.scaladsl.GraphDSL.Implicits._

    RunnableGraph.fromGraph(GraphDSL.create() { implicit builder =>
      val A: Outlet[Int] = builder.add(Source.single(0)).out
      val B: UniformFanOutShape[Int, Int] = builder.add(Broadcast[Int](2))
      val C: UniformFanInShape[Int, Int] = builder.add(Merge[Int](2))
      val D: FlowShape[Int, Int] = builder.add(Flow[Int].map(_ + 1))
      val E: UniformFanOutShape[Int, Int] = builder.add(Balance[Int](2))
      val F: UniformFanInShape[Int, Int] = builder.add(Merge[Int](2))
      val G: Inlet[Any] = builder.add(Sink.foreach(println)).in

                    C     <~      F
      A  ~>  B  ~>  C     ~>      F
             B  ~>  D  ~>  E  ~>  F
                           E  ~>  G

      ClosedShape
    })
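The graph above contains a cycle, which makes it hard to reason about as a first example. A simpler acyclic fan-out/fan-in graph can be sketched as follows, assuming Akka 2.5.x; the object name and the two multiplying branches are illustrative. Passing `Sink.seq` to `GraphDSL.create` makes its materialized value the value of the whole graph.

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ClosedShape}
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, RunnableGraph, Sink, Source}

import scala.collection.immutable
import scala.concurrent.Await
import scala.concurrent.duration._

object GraphDslSketch {
  def run(): immutable.Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("graph-dsl-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    val graph = RunnableGraph.fromGraph(
      GraphDSL.create(Sink.seq[Int]) { implicit builder => sink =>
        import GraphDSL.Implicits._
        val bcast = builder.add(Broadcast[Int](2)) // copies each element to 2 outputs
        val merge = builder.add(Merge[Int](2))     // merges 2 inputs into one output

        Source(1 to 3) ~> bcast
        bcast ~> Flow[Int].map(_ * 10) ~> merge
        bcast ~> Flow[Int].map(_ * 100) ~> merge
        merge ~> sink.in
        ClosedShape
      })
    // Merge does not guarantee ordering between branches, so sort the result.
    try Await.result(graph.run(), 5.seconds).sorted
    finally system.terminate()
  }
}
```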

Slide 23

Slide 23 text

INTEGRATION WITH AKKA ACTORS
● An actor can be a Source or a Sink
● The back pressure protocol – normal actor messages
    class LongCounter extends ActorPublisher[Long] {
      private var counter = 0L

      override def receive: Receive = {
        case ActorPublisherMessage.Request(n) =>
          // Emit exactly n elements; `0 to n` would emit n + 1 and exceed the demand.
          for (_ <- 1L to n) {
            counter += 1
            onNext(counter)
          }
        case ActorPublisherMessage.Cancel =>
          context.stop(self)
      }
    }

Slide 24

Slide 24 text

CONCLUSION
● Akka Streams – a way to build arbitrarily complex type-safe data processing pipelines
● Complex inside, but the interface is reasonably simple
● Gives control over execution, including back pressure and asynchronous execution
● Don’t misuse it: it might not be suitable for the task

Slide 25

Slide 25 text

WHERE TO GET INFORMATION
● The official documentation http://doc.akka.io/docs/akka/current/scala/stream/index.html
● Akka team blog http://blog.akka.io/

Slide 26

Slide 26 text

BUILT ON TOP OF AKKA STREAMS
● Akka HTTP – HTTP client and server http://doc.akka.io/docs/akka-http/current/scala.html
● Alpakka – enterprise integration patterns (like Apache Camel) (WIP) http://developer.lightbend.com/docs/alpakka/current/

Slide 27

Slide 27 text

BLOG POST VERSION OF THIS PRESENTATION
My blog: https://ivanyu.me/blog/2016/12/12/about-akka-streams/
Zalando blog: https://tech.zalando.com/blog/about-akka-streams/

Slide 28

Slide 28 text

QUESTIONS?

Slide 29

Slide 29 text

More about Zalando?
FOLLOW US: #Zelsinki #ZalandoTech
LinkedIn: Zalando SE
Facebook & Instagram: @insidezalando
Twitter: @ZalandoTech
Tech Blog: https://jobs.zalando.com/tech/blog/
CAREERS: https://jobs.zalando.com