Towards Stateful FaaS on Streaming Dataflow

adilakhter
October 08, 2019

Transcript

  1. Web application, DB

    Operations, configuration, programming model, debugging, failure handling, transaction management, fault tolerance, scaling, monitoring, deployment. Credit for icons: https://www.flaticon.com
  2. Web application, DB: IaaS, PaaS, SaaS, Serverless, FaaS

    Operations, configuration, programming model, debugging, failure handling, transaction management, fault tolerance, scaling, monitoring, deployment. Credit for icons: https://www.flaticon.com
  3. FaaS: MS Azure Functions, Google Cloud Functions

    Credit for logos: https://aws.amazon.com/lambda/ https://cloud.google.com/functions/
  4. FaaS: functions (Fn) and cloud storage

    Managed infrastructure, function-based programming model ✅ Credit for icons: https://www.flaticon.com
  5. FaaS: functions (Fn) and cloud storage

    Stateless functions. Stateful functions, Fn-to-fn calls, coordination ❌ Credit for icons: https://www.flaticon.com
  6. Services Architecture (1): Easiest Implementation

    ▪ Perform an order iff there is stock available and the payment is cleared.
    ▪ Services are stateless.
    ▪ The database does the heavy lifting.
    ▪ High latency, costly coordination calls.
    Diagram labels: Order, Stock, Payments (business logic); DB; RPC call, RPC response; REST call.
  7. Services Architecture (2): Embedded State/DB

    ▪ Make the state of each service local to the business logic.
    ▪ Services are now stateful.
    ▪ Low-latency access to local state.
    ▪ Service calls are still expensive.
    ▪ Not obvious how to scale this out.
    Diagram labels: Order, Stock, Payments (business logic, each with a DB); REST call.
  8. Services Architecture (3): Event Sourcing

    ▪ Each message exchange/change to the state goes through an event-log.
    ▪ Services are asynchronous/reactive.
    ▪ If we lose state, we replay the log and rebuild it.
    ▪ Time-travel debugging, audits, etc. are trivial.
    Diagram labels: Order, Stock, Payments (business logic, each with a DB); REST call; event-log.
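The replay idea on this slide can be illustrated with a minimal sketch. Everything named here (`StockEvent`, `Restocked`, `Reserved`, `rebuild`, `stateAt`) is a hypothetical stand-in invented for illustration, not part of the system presented in the deck; the sketch only assumes that state is a pure function of the ordered event log.

```scala
object EventSourcingSketch {
  // Hypothetical events for a Stock service; not taken from the slides.
  sealed trait StockEvent
  final case class Restocked(item: String, qty: Int) extends StockEvent
  final case class Reserved(item: String, qty: Int) extends StockEvent

  // Service state: quantity on hand per item.
  type StockState = Map[String, Int]

  // Apply a single logged event to the current state.
  def applyEvent(state: StockState, event: StockEvent): StockState = event match {
    case Restocked(item, qty) => state.updated(item, state.getOrElse(item, 0) + qty)
    case Reserved(item, qty)  => state.updated(item, state.getOrElse(item, 0) - qty)
  }

  // Lost state is rebuilt by replaying the whole log from an empty state.
  def rebuild(log: Seq[StockEvent]): StockState =
    log.foldLeft(Map.empty[String, Int])(applyEvent)

  // "Time travel": the state as it was after the first n events.
  def stateAt(log: Seq[StockEvent], n: Int): StockState =
    rebuild(log.take(n))
}
```

Because `rebuild` is a fold over the log, auditing or debugging a past state amounts to folding over a prefix of the same log, which is what `stateAt` does.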
  9. Services Architecture (4): Scalable Deployment

    Diagram labels: RPC calls; subscribe for responses; event-log; Order 1, Order 2, Order 3, Stock 1, Stock 2, Payment 1 (business logic, each with a DB).
  10. Does it ring a bell?

    ▪ Millions of events per second on a couple of machines.
    ▪ Consistent snapshots of state, with exactly-once guarantees.
    ▪ We can scale operators in and out to handle varying workloads.
    Diagram labels: input message queues; operators OP1, OP2, OP3, OP4, OP5; output message queue; Apache Flink.
  11. Stateful FaaS on Streaming Dataflow

    ▪ Time-travel debugging using checkpoints and message broker.
    ▪ Guaranteed message delivery and exactly-once processing.
    ▪ Each operator executes a group of functions that share the same state.
    ▪ Operator-local state partitioned on key input for scalability and fault-tolerance.
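As a rough illustration of operator-local state partitioned on the input key, the sketch below routes each key to one of several parallel instances and keeps that key's counter only in the owning instance. The names `partitionFor` and `countPerKey` are hypothetical, and the plain hash-modulo routing is a simplifying assumption: Apache Flink's actual key-group assignment is more involved.

```scala
object KeyPartitioningSketch {
  // Route a key to one of `parallelism` operator instances.
  // Assumption: plain hash-modulo routing, chosen only for illustration.
  def partitionFor(key: String, parallelism: Int): Int =
    Math.floorMod(key.hashCode, parallelism)

  // Count events per key. Each partition holds only the counters of the
  // keys it owns, so partitions can be processed and recovered independently.
  def countPerKey(keys: Seq[String], parallelism: Int): Vector[Map[String, Long]] =
    keys.foldLeft(Vector.fill(parallelism)(Map.empty[String, Long])) { (parts, key) =>
      val p = partitionFor(key, parallelism)
      val local = parts(p)
      parts.updated(p, local.updated(key, local.getOrElse(key, 0L) + 1L))
    }
}
```

Since a key always maps to the same partition, all updates for that key are applied by one instance, which is what allows the state to stay local to the operator.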
  12. Dataflow graphs as a backend for Stateful Services

    Dataflow graphs could serve as a scalable backend for microservices and cloud applications based on stateful functions.
  13. 20

  14. 21

  15. 22

  16. 26

  17. 27

  18. Fn ≡ λ

    1. Pure Function (Fn) without State
    2. Fn with State
    3. Fn that supports Orchestration
  19. Stateful λ

        def sPingFn(p: Ping, ctx: ExecutionContext[Pong], state: PingCounter): λ[Pong]

        case class PingCounter(
          name: String,
          i: Long,
          allRequestString: Seq[String]) extends ManagedState

        def pingFn(p: Ping, ctx: ExecutionContext[Pong]): λ[Pong]
  20. Stateful λ: sPingFn

        def sPingFn(p: Ping, ctx: ExecutionContext[Pong], state: PingCounter): λ[Pong] = {
          val message = "Pong"
          for {
            p ← ctx.persist(Pong(p.shardId, s"$message at ${java.time.Instant.now()}"))
          } yield p
        }
  21. FnNamespace

    Every Fn (λ) is part of an FnNamespace (which can be considered a BoundedContext). Diagram labels: FnNamespace; λ1, λ2, λn.

        class PingServiceFn extends FnNamespace {
          override def descriptor: FnNamespaceDescriptor =
            named("PingService")
              .withQualifiedPaths(
                register("ping", pingFn _),
                register("sPing", sPingFn _),
                register("pingStat", pingStatFn _))
          // rest of the implementation
        }
  22. FnNamespace (λ1, λ2, λn)

        class PingServiceFn extends FnNamespace {
          type State = PingCounter
          def initialState: State = PingCounter("PingCounter", 0, Seq.empty)
          // rest of the implementation
        }
  23. Recall sPingFn

        def sPingFn(p: Ping, ctx: ExecutionContext[Pong], state: PingCounter): λ[Pong] = {
          val message = "Pong"
          for {
            p ← ctx.persist(Pong(p.shardId, s"$message at ${java.time.Instant.now()}"))
          } yield p
        }
  24. FnNamespace (λ1, λ2, λn)

        type EventHandler[S] = PartialFunction[(FnResponse, S), S]

        class PingServiceFn extends FnNamespace {
          // …
          def onPersist: EventHandler[State] = {
            case (Pong(_, s), PingCounter(n, i, allRequests)) ⇒ PingCounter(n, i + 1, s +: allRequests)
            case (_, state) ⇒ state // Do nothing with the State
          }
          // rest of the implementation
        }
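To show how such an event handler might evolve state, here is a self-contained sketch. The `FnResponse`, `Pong`, and `PingCounter` definitions are simplified stand-ins for the deck's types (the `shardId` type and the `Stat` event are assumptions, and `ManagedState` is omitted), and `replay` is a hypothetical helper rather than part of the presented API.

```scala
object EventHandlerSketch {
  // Simplified stand-ins for the deck's types; field types are assumed.
  sealed trait FnResponse
  final case class Pong(shardId: String, message: String) extends FnResponse
  final case class Stat(total: Long) extends FnResponse // hypothetical second event

  final case class PingCounter(name: String, i: Long, allRequestString: Seq[String])

  // Same shape as the alias shown on the slide.
  type EventHandler[S] = PartialFunction[(FnResponse, S), S]

  // Pong events increment the counter and are prepended to the history;
  // every other event leaves the state unchanged.
  val onPersist: EventHandler[PingCounter] = {
    case (Pong(_, s), PingCounter(n, i, all)) => PingCounter(n, i + 1, s +: all)
    case (_, state)                           => state
  }

  // Hypothetical helper: fold persisted events through the handler.
  def replay(initial: PingCounter, events: Seq[FnResponse]): PingCounter =
    events.foldLeft(initial)((state, event) => onPersist((event, state)))
}
```

Keeping the handler a pure function of `(event, state)` is what makes the state reconstructible from persisted events alone.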
  25. We strongly believe that streaming dataflows can have a central place in service-oriented architectures,

    taking over the execution of ACID transactions and ensuring message delivery and processing, in order to perform scalable execution of services.
  26. Orchestrator λ

    The provided programming abstraction supports calling another Fn (λ) from the same FnNamespace or from a different FnNamespace available in the system.
  27. Orchestrator λ: order creation

        for {
          p ← ctx.callFn[PaymentFnRequest, PaymentFnResult]("PaymentFn.reserveCredit", pr)
          _ ← ctx.persist(p) // updating the state
          s ← ctx.callFn[PrepareStockRequest, StockFnResponse]("StockFn.prepareOrder", ps)
          _ ← ctx.persist(s) // updating the state
        } yield orderCreationResponse(r, p, s)

    Diagram labels: OrderService, PaymentService, StockService; reserveBalance, prepareOrder.
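The sequencing in this for-comprehension can be mimicked with a minimal, self-contained sketch. It swaps the deck's `λ` and `ctx.callFn` for plain `Either` and local functions, which is an assumption made only so the example runs on its own; all names and signatures below are hypothetical.

```scala
object OrchestrationSketch {
  // Stand-in for the deck's λ[...]: Left carries a failure reason.
  type Result[A] = Either[String, A]

  final case class PaymentResult(orderId: String, reservedCents: Long)
  final case class StockResult(orderId: String, items: Int)
  final case class OrderCreated(orderId: String, reservedCents: Long, items: Int)

  // Hypothetical local stand-in for calling "PaymentFn.reserveCredit".
  def reserveCredit(orderId: String, amountCents: Long, balanceCents: Long): Result[PaymentResult] =
    if (amountCents <= balanceCents) Right(PaymentResult(orderId, amountCents))
    else Left("insufficient credit")

  // Hypothetical local stand-in for calling "StockFn.prepareOrder".
  def prepareOrder(orderId: String, qty: Int, inStock: Int): Result[StockResult] =
    if (qty <= inStock) Right(StockResult(orderId, qty))
    else Left("out of stock")

  // The order succeeds only if both steps succeed; the first failure
  // short-circuits the remaining steps.
  def createOrder(orderId: String, amountCents: Long, qty: Int,
                  balanceCents: Long, inStock: Int): Result[OrderCreated] =
    for {
      p <- reserveCredit(orderId, amountCents, balanceCents)
      s <- prepareOrder(orderId, qty, inStock)
    } yield OrderCreated(orderId, p.reservedCents, s.items)
}
```

Note that this sketch stops at the first failure and does not undo an already reserved credit when the stock step fails; handling that case is outside what the sketch covers.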
  28. 54

  29. Orchestrator λ: order creation

        for {
          p ← ctx.callFn[PaymentFnRequest, PaymentFnResult]("PaymentFn.reserveCredit", pr)
          _ ← ctx.persist(p) // updating the state
          s ← ctx.callFn[PrepareStockRequest, StockFnResponse]("StockFn.prepareOrder", ps)
          _ ← ctx.persist(s) // updating the state
        } yield orderCreationResponse(r, p, s)

    Diagram labels: OrderService, PaymentService, StockService; reserveBalance, prepareOrder.
  30. Deployment

        $ rho deploy -n OrderFn --parallelism 2 --skip_jar
        $ rho deploy -n PaymentFn --parallelism 2 --skip_jar
        $ rho deploy -n StockFn --parallelism 2 --skip_jar
  31. 60

  32. 61

  33. 62

  34. 63

  35. 65

  36. 68

  37. Shameless plug by Asterios Katsifodimos (@kasterios)

    Online introductory course on stream processing starts on January 15th, covering all fundamental stream processing concepts (time, order, windows, joins, etc.). We are using Apache Flink for assignments and include invited talks from industry and academia. Enrollment is open. tudelft.nl/taming-big-data-streams
  38. References

    1. "Stateful Functions as a Service in Action": Adil Akhter, Marios Fragkoulis, Asterios Katsifodimos. In the Proceedings of the 45th International Conference on Very Large Data Bases (VLDB) 2019.
    2. "Operational Stream Processing: Towards Scalable and Consistent Event-Driven Applications": Asterios Katsifodimos, Marios Fragkoulis. In the Proceedings of the 22nd International Conference on Extending Database Technology (EDBT) 2019.
    3. "Benchmarking Distributed Stream Data Processing Systems": Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, Volker Markl. In the Proceedings of the International Conference on Data Engineering (ICDE) 2018.
    4. "Efficient Window Aggregation with General Stream Slicing": Jonas Traub, Philipp M. Grulich, Alejandro Rodriguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl. In the Proceedings of the 22nd International Conference on Extending Database Technology (EDBT) 2019.
  39. 75