Stream Driven Development - Design your data pipeline with Akka Streams

Reactive Streams are key to building asynchronous, data-intensive applications with no predetermined data volumes. Akka Streams is a well-known Reactive Streams implementation offering a powerful DSL to build complex, Akka-backed reactive pipelines.

At HomeAway we devised an approach to rolling out reactive data pipelines that combines elements of Domain-Driven Design with the abstraction power of the Akka Streams model.

In this talk we’ll

- briefly introduce Reactive Streams and Akka Streams
- present our approach to pipeline design by example, discussing useful patterns to reason about your streaming application and type-drive its implementation

Stefano Bonetti

November 24, 2017

Transcript

  1. © 2017 HOMEAWAY. ALL RIGHTS RESERVED. OBJECTIVE

    Quickly roll out pipelines which are:
    • RESILIENT
    • SCALABLE
    • REACTIVE
  3. package org.reactivestreams;

    public interface Processor<T, R> extends Subscriber<T>, Publisher<R> { }

    public interface Publisher<T> {
        public void subscribe(Subscriber<? super T> s);
    }

    public interface Subscriber<T> {
        public void onSubscribe(Subscription s);
        public void onNext(T t);
        public void onError(Throwable t);
        public void onComplete();
    }

    public interface Subscription {
        public void request(long n);
        public void cancel();
    }
  4. WHY REACTIVE STREAMS

    • flow control with asynchronous backpressure
    • interoperability between tools and libraries
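The demand-driven flow control named above can be tried without any library. This is a minimal sketch, assuming JDK 9 or later, where `java.util.concurrent.Flow` provides interfaces that mirror the `org.reactivestreams` ones. The names `OneAtATime` and `BackpressureSketch` are hypothetical and are not taken from the talk.

```scala
import java.util.concurrent.{CountDownLatch, Flow, SubmissionPublisher, TimeUnit}
import scala.collection.mutable.ListBuffer

// Hypothetical subscriber: it requests one element at a time, so the
// publisher never delivers more than has been explicitly requested.
final class OneAtATime(done: CountDownLatch) extends Flow.Subscriber[String] {
  private var subscription: Flow.Subscription = null
  val received: ListBuffer[String] = ListBuffer.empty

  override def onSubscribe(s: Flow.Subscription): Unit = {
    subscription = s
    s.request(1) // signal initial demand
  }

  override def onNext(item: String): Unit = {
    received += item
    subscription.request(1) // ask for the next element only when ready
  }

  override def onError(t: Throwable): Unit = done.countDown()

  override def onComplete(): Unit = done.countDown()
}

object BackpressureSketch {
  // Publishes three elements and returns what the subscriber received.
  def run(): List[String] = {
    val done       = new CountDownLatch(1)
    val subscriber = new OneAtATime(done)
    val publisher  = new SubmissionPublisher[String]()

    publisher.subscribe(subscriber)
    List("a", "b", "c").foreach(item => publisher.submit(item))
    publisher.close() // onComplete is signalled after pending items are delivered

    done.await(5, TimeUnit.SECONDS)
    subscriber.received.toList
  }
}
```

Akka Streams stages perform this request/deliver handshake internally, which is why the DSL on the following slides never calls `request` directly.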
  7. WHY AKKA STREAMS

    • higher-level DSL
    • based on Akka actors
    • Scala native, with Java and Scala DSLs
    • 3rd party integrations / connectors
  11. STREAMS STAGES

    • SOURCE (1 output)
    • FLOW (1 input, 1 output)
    • SINK (1 input)
    • FAN-IN (n inputs, 1 output)
    • FAN-OUT (1 input, n outputs)
    • RUNNABLEGRAPH (no input or output)
    • ...
    • CUSTOM
  12. AKKA STREAMS

    val source: Source[String, NotUsed] = Source.single("World")
    val flow  : Flow[String, String, NotUsed] = Flow[String].map(x ⇒ s"Hello, $x!")
    val sink  : Sink[String, Future[Done]] = Sink.foreach[String](println)

    source.via(flow).runWith(sink)
  16. AKKA STREAMS

    val source: Source[String, NotUsed] = Source.single("World")
    val flow  : Flow[String, String, NotUsed] = Flow[String].map(x ⇒ s"Hello, $x!")
    val sink  : Sink[String, Future[Done]] = Sink.foreach[String](println)

    val eventualCompletion: Future[Done] = source.via(flow).runWith(sink)

    // onComplete needs an implicit ExecutionContext in scope
    eventualCompletion.onComplete {
      case Success(_)  ⇒ println("Stream completed!")
      case Failure(ex) ⇒ println(s"Stream failed with error $ex")
    }
  17. SOURCE - REACTIVE KAFKA

    def propertySource(config: KafkaConfig): Source[PropertyChange, NotUsed] = {

      def settings(config: KafkaConfig): ConsumerSettings[String, PropertyChange] = ???

      val kafkaSrc: Source[ConsumerRecord[String, PropertyChange], Control] =
        Consumer.plainSource(
          settings(config),
          Subscriptions.topics(config.topic)
        )

      kafkaSrc
        .map(_.value)
        .mapMaterializedValue { _ ⇒ NotUsed }
    }
  20. PROCESSING FLOW

    def processingFlow(): Flow[PropertyChange, Either[ValidationError, AdwordsChange], NotUsed] = {
      val service: PropertyProcessingService = ???

      Flow.fromFunction(service.process)
    }
  21. STORING SINK

    def adwordsSink(config: AdwordsConfig): Sink[AdwordsChange, Future[Done]] = {
      val service: AdwordsService = ???

      Flow[AdwordsChange]
        .mapAsync(config.parallelism)(service.store)
        .toMat(Sink.ignore)(Keep.right)
    }
  22. ERROR SINK

    def errorSink(cfg: ErrorConfig): Sink[ValidationError, NotUsed] = {
      val service: AdwordsErrorService = ???

      Sink.foreach[ValidationError](service.handle)
        .mapMaterializedValue(_ ⇒ NotUsed)
    }
  23. GRAPH

    def graph(source   : Source[PropertyChange, NotUsed],
              process  : Flow[PropertyChange, Either[ValidationError, AdwordsChange], NotUsed],
              store    : Sink[AdwordsChange, Future[Done]],
              errorSink: Sink[AdwordsStreamError, NotUsed]): RunnableGraph[Future[Done]] = {
      RunnableGraph.fromGraph(GraphDSL.create(store) { implicit builder ⇒ store ⇒
        val p = builder.add(PartitionEither.apply()[ValidationError, AdwordsChange])

                             p.out0 ~> errorSink
        source ~> process ~> p.in
                             p.out1 ~> store

        ClosedShape
      })
    }
  26. DOMAIN

    • PropertyProcessingService
        def process(p: PropertyChange): Either[ValidationError, AdwordsChange]
    • AdwordsService
        def store(p: AdwordsChange): Future[Either[StorageError, Stored]]
    • AdwordsErrorService
        def handle(p: AdwordsError): Unit
    • PropertyChange
    • ValidationError
    • AdwordsChange
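Because the domain service is a pure function returning an `Either`, the routing performed by the graph can be reasoned about and tested without any streaming machinery. The following is a minimal sketch in plain Scala: the case-class fields, the validation rule and the `RoutingSketch` helper are assumptions made for illustration; only the type names and the `process` signature come from the slides.

```scala
// Hypothetical stand-ins for the domain types named on the slides;
// the fields and the validation rule are assumptions for illustration.
final case class PropertyChange(propertyId: String, headline: String)
final case class AdwordsChange(propertyId: String, adText: String)
final case class ValidationError(propertyId: String, reason: String)

object PropertyProcessingService {
  // Pure domain function with the signature shown on the slide.
  def process(p: PropertyChange): Either[ValidationError, AdwordsChange] =
    if (p.headline.trim.isEmpty) Left(ValidationError(p.propertyId, "empty headline"))
    else Right(AdwordsChange(p.propertyId, s"Book ${p.headline} now"))
}

object RoutingSketch {
  // Mirrors the graph's routing on a plain List: Left values go to the
  // error side, Right values go to the storing side.
  def route(changes: List[PropertyChange]): (List[ValidationError], List[AdwordsChange]) = {
    val results = changes.map(PropertyProcessingService.process)
    (results.collect { case Left(e) => e }, results.collect { case Right(c) => c })
  }
}
```

In the streaming version the same function is lifted with `Flow.fromFunction`, and the split is done by the partitioning stage feeding the error sink and the storing sink.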
  27. APPLICATION

    • Failure Handling
    • Config Management
    • Materialization
    • Stage Creation
  28. Layers

    • Application layer: Graph
    • Domain layer: Events, Services, Repositories, Factories
    • Reactive layer: Sources, Flows, Sinks, Other
  29. [Diagram: the Reactive layer / Application layer / Domain layer stack, shown four times]
  30. Thank you.

    Stefano Bonetti, Software Engineer
    @svezfaz @svez_faz
    homeaway.com/careers