Slide 1

Slide 1 text

Towards Stateful FaaS
Marios Fragkoulis & Adil Akhter
Flink Forward
Berlin • 08 October 2019

Slide 2

Slide 2 text

Outline

Slide 3

Slide 3 text

● Motivation
● Rho - FaaS beyond Stateless
● Research Direction
● Conclusion

Slide 4

Slide 4 text

Motivation

Slide 5

Slide 5 text

Web application, DB
Credit for icons: https://www.flaticon.com

Slide 6

Slide 6 text

Web application, DB
Operations, Configuration, Programming model, Debugging, Failure handling, Transaction management, Fault tolerance, Scaling, Monitoring, Deployment
Credit for icons: https://www.flaticon.com

Slide 7

Slide 7 text

Web application, DB
IaaS, PaaS, SaaS, Serverless, FaaS
Operations, Configuration, Programming model, Debugging, Failure handling, Transaction management, Fault tolerance, Scaling, Monitoring, Deployment
Credit for icons: https://www.flaticon.com

Slide 8

Slide 8 text

FaaS
MS Azure Functions, Google Cloud Functions
Credit for logos: https://aws.amazon.com/lambda/ https://cloud.google.com/functions/

Slide 9

Slide 9 text

FaaS
[Diagram: a grid of functions (Fn) on top of cloud storage]
Managed infrastructure, Function-based programming model ✅
Credit for icons: https://www.flaticon.com

Slide 10

Slide 10 text

FaaS
[Diagram: a grid of functions (Fn) on top of cloud storage]
Stateless functions, Stateful functions, Fn-to-fn calls, Coordination ❌
Credit for icons: https://www.flaticon.com

Slide 11

Slide 11 text

If not FaaS, then what?

Slide 12

Slide 12 text

Services Architecture (1): Easiest Implementation
[Diagram: Order, Stock, and Payments business logic; REST calls between the services; RPC calls and responses between each service and a shared DB]
▪ Perform an order iff there is stock available and the payment is cleared.
▪ Services are stateless.
▪ The database does the heavy lifting.
▪ High latency, costly coordination calls.

Slide 13

Slide 13 text

Services Architecture (2): Embedded State/DB
[Diagram: Order, Stock, and Payments services, each with its own DB next to the business logic; REST calls between the services]
▪ Make the state of each service local to the business logic.
▪ Services are now stateful.
▪ Low-latency access to local state.
▪ Service calls are still expensive.
▪ Not obvious how to scale this out.

Slide 14

Slide 14 text

Services Architecture (3): Event Sourcing
[Diagram: Order, Stock, and Payments services, each with its own DB and business logic, connected through event logs]
▪ Each message exchange or change to the state goes through an event log.
▪ Services are asynchronous/reactive.
▪ If we lose state, we replay the log and rebuild it.
▪ Time-travel debugging, audits, etc. are trivial.
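Sketch (not from the talk): the "replay the log and rebuild it" idea can be illustrated as a fold over an in-memory event log. The event types `Restocked` and `Reserved` and the `EventLogReplay` object are hypothetical names chosen for illustration; a real deployment would read from a durable log.

```scala
// Hypothetical events for a stock service; these names are assumptions, not part of the talk.
sealed trait StockEvent
final case class Restocked(item: String, qty: Int) extends StockEvent
final case class Reserved(item: String, qty: Int) extends StockEvent

object EventLogReplay {
  // Service state: units available per item.
  type Stock = Map[String, Int]

  // Apply a single event to the current state.
  def applyEvent(state: Stock, event: StockEvent): Stock = event match {
    case Restocked(item, qty) => state.updated(item, state.getOrElse(item, 0) + qty)
    case Reserved(item, qty)  => state.updated(item, state.getOrElse(item, 0) - qty)
  }

  // Rebuild the state from scratch by replaying the whole log in order.
  def replay(log: Seq[StockEvent]): Stock =
    log.foldLeft(Map.empty[String, Int])(applyEvent)

  // "Time travel": the state as it was after the first n events.
  def replayUntil(log: Seq[StockEvent], n: Int): Stock =
    replay(log.take(n))
}
```

Because the state is a pure function of the log, losing the local DB only costs a replay, and inspecting any earlier state amounts to replaying a prefix of the log.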

Slide 15

Slide 15 text

Services Architecture (4): Scalable Deployment
[Diagram: three Order instances, two Stock instances, and one Payment instance, each with its own DB and business logic; the instances issue RPC calls and subscribe for responses through event logs]

Slide 16

Slide 16 text

Does it ring a bell?
[Diagram: input message queues feed operators OP1 to OP5, which write to an output message queue; labelled Apache Flink]
▪ Millions of events per second on a couple of machines.
▪ Consistent snapshots of state, with exactly-once guarantees.
▪ We can scale operators in and out to handle varying workloads.

Slide 17

Slide 17 text

Stateful FaaS on Streaming Dataflow
▪ Time-travel debugging using checkpoints and the message broker.
▪ Guaranteed message delivery and exactly-once processing.
▪ Each operator executes a group of functions that share the same state.
▪ Operator-local state is partitioned on the input key for scalability and fault tolerance.
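Sketch (not from the talk): partitioning operator-local state on the input key can be illustrated with plain hash partitioning. This is a simplification for illustration and is not the scheme Flink or Rho actually uses; `KeyPartitioning`, `partitionFor`, and `countByPartition` are hypothetical names.

```scala
object KeyPartitioning {
  // Route a key to one of `parallelism` operator instances.
  // floorMod keeps the result non-negative even for negative hash codes.
  def partitionFor(key: String, parallelism: Int): Int =
    Math.floorMod(key.hashCode, parallelism)

  // Each partition keeps state (here: a count per key) only for the keys routed to it.
  def countByPartition(keys: Seq[String], parallelism: Int): Map[Int, Map[String, Long]] =
    keys
      .groupBy(partitionFor(_, parallelism))
      .map { case (partition, routed) =>
        partition -> routed.groupBy(identity).map { case (k, vs) => k -> vs.size.toLong }
      }
}
```

Since a given key always lands on the same partition, each instance can update its slice of the state without coordinating with the others.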

Slide 18

Slide 18 text

Dataflow graphs as a backend for Stateful Services
Dataflow graphs could serve as a scalable backend for microservices and cloud applications based on stateful functions.

Slide 19

Slide 19 text

Recap

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

Built using largely ad-hoc, time-consuming, low-level programming.

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

ρ (rho) Function-as-a-Service beyond Stateless

Slide 28

Slide 28 text

Programming Model

Slide 29

Slide 29 text

λ :: I ⇒ O
I: Domain, O: Codomain

Slide 30

Slide 30 text

Fn ≡ λ
1. Pure Function (Fn) without State
2. Fn with State
3. Fn that supports Orchestration

Slide 31

Slide 31 text

Stateless λ
def pingFn(p: Ping): Pong = Pong(id, "Pong!")

Slide 32

Slide 32 text

def pingFn(p: Ping, ctx: ExecutionContext[Pong]): λ[Pong] =
  ctx.returnWith(Pong(id, "Pong!"))

Slide 33

Slide 33 text

Stateful λ
def sPingFn(p: Ping, ctx: ExecutionContext[Pong], state: PingCounter): λ[Pong]
case class PingCounter(
  name: String,
  i: Long,
  allRequestString: Seq[String]) extends ManagedState
def pingFn(p: Ping, ctx: ExecutionContext[Pong]): λ[Pong]

Slide 34

Slide 34 text

def sPingFn(p: Ping, ctx: ExecutionContext[Pong], state: PingCounter): λ[Pong] = {
  val message = "Pong"
  for {
    p ← ctx.persist(Pong(p.shardId, s"$message at ${java.time.Instant.now()}"))
  } yield p
}

Slide 35

Slide 35 text

FnNamespace
Every Fn (λ) is part of a FnNamespace (which can be considered a BoundedContext).
[Diagram: FnNamespace containing λ1, λ2, …, λn]
class PingServiceFn extends FnNamespace {
  override def descriptor: FnNamespaceDescriptor =
    named("PingService")
      .withQualifiedPaths(
        register("ping", pingFn _),
        register("sPing", sPingFn _),
        register("pingStat", pingStatFn _))
  // rest of the implementation
}

Slide 36

Slide 36 text

State Management
[Diagram: FnNamespace containing λ1, λ2, …, λn]

Slide 37

Slide 37 text

class PingServiceFn extends FnNamespace {
  type State = PingCounter
  def initialState: State = PingCounter("PingCounter", 0, Seq.empty)
  // rest of the implementation
}
[Diagram: FnNamespace containing λ1, λ2, …, λn]

Slide 38

Slide 38 text

[Diagram: FnNamespace containing λ1, λ2, …, λn]

Slide 39

Slide 39 text

Recall sPingFn
def sPingFn(p: Ping, ctx: ExecutionContext[Pong], state: PingCounter): λ[Pong] = {
  val message = "Pong"
  for {
    p ← ctx.persist(Pong(p.shardId, s"$message at ${java.time.Instant.now()}"))
  } yield p
}

Slide 40

Slide 40 text

class PingServiceFn extends FnNamespace {
  // …
  def onPersist: EventHandler[State] = {
    case (Pong(_, s), PingCounter(n, i, allRequests)) ⇒ PingCounter(n, i + 1, s +: allRequests)
    case (_, state) ⇒ state // Do nothing with the State
  }
  // rest of the implementation
}
type EventHandler[S] = PartialFunction[(FnResponse, S), S]
[Diagram: FnNamespace containing λ1, λ2, …, λn]
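Sketch (not from the talk): how an `onPersist` handler of this shape turns persisted responses into state can be tried out without Rho. The `FnResponse`, `Pong`, and `PingCounter` definitions below are simplified stand-ins (for example, `shardId` is assumed to be a `String`), and `Unrelated`, `OnPersistSketch`, and `replay` are hypothetical names.

```scala
// Simplified stand-ins for Rho's types so the sketch compiles on its own (assumptions).
trait FnResponse
final case class Pong(shardId: String, message: String) extends FnResponse
final case class Unrelated(note: String) extends FnResponse
final case class PingCounter(name: String, i: Long, allRequestString: Seq[String])

object OnPersistSketch {
  // Same shape as the EventHandler alias on the slide.
  type EventHandler[S] = PartialFunction[(FnResponse, S), S]

  // Count every persisted Pong and remember its message; ignore everything else.
  val onPersist: EventHandler[PingCounter] = {
    case (Pong(_, s), PingCounter(n, i, allRequests)) => PingCounter(n, i + 1, s +: allRequests)
    case (_, state)                                   => state
  }

  // Fold a sequence of persisted responses into the state, starting from `initial`.
  def replay(initial: PingCounter, persisted: Seq[FnResponse]): PingCounter =
    persisted.foldLeft(initial)((state, event) => onPersist((event, state)))
}
```

Keeping the state transition in one handler, separate from the functions that call `ctx.persist`, means the same handler can presumably be reused when state is rebuilt from persisted events.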

Slide 41

Slide 41 text

Putting it all together

Slide 42

Slide 42 text

Execution Semantics

Slide 43

Slide 43 text

We strongly believe that streaming dataflows can have a central place in service-oriented architectures, taking over the execution of ACID transactions and ensuring message delivery and processing, in order to perform scalable execution of services.

Slide 44

Slide 44 text

Compiler

Slide 45

Slide 45 text

[Diagram: FnNamespace1, FnNamespace2, …, FnNamespacen, each with its own queues ReqQ and ResQ (ReqQ1/ResQ1, ReqQ2/ResQ2, …, ReqQn/ResQn)]

Slide 46

Slide 46 text

[Diagram: CLI, Gateway, FnNamespace 1, FnNamespace 2, …, FnNamespace N, Events]

Slide 47

Slide 47 text

Recall PingServiceFn

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

Orchestrator λ
The provided programming abstraction supports calling another Fn (λ) from the same FnNamespace or from a different FnNamespace available in the system.

Slide 50

Slide 50 text

[Diagram: OrderService, PaymentService, StockService; calls labelled reserveBalance and prepareOrder]

Slide 51

Slide 51 text

[Diagram: OrderService, PaymentService, StockService; calls labelled reserveBalance and prepareOrder]
for {
  p ← ctx.callFn[PaymentFnRequest, PaymentFnResult]("PaymentFn.reserveCredit", pr)
  _ ← ctx.persist(p) // updating the state
  s ← ctx.callFn[PrepareStockRequest, StockFnResponse]("StockFn.prepareOrder", ps)
  _ ← ctx.persist(s) // updating the state
} yield orderCreationResponse(r, p, s)
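Sketch (not from the talk): the sequencing behaviour of such an orchestrating for-comprehension can be tried with `Either` as a synchronous stand-in for `λ`. Rho's real `λ`, `ctx.callFn`, and `ctx.persist` are not reproduced here; `Lambda`, `Ctx`, `createOrder`, and the result types are hypothetical, and the two remote calls are passed in as plain functions.

```scala
import scala.collection.mutable.ListBuffer

object OrchestrationSketch {
  // Stand-in for λ[A]: Left carries a failure reason, Right a result (assumption).
  type Lambda[A] = Either[String, A]

  final case class PaymentResult(orderId: String)
  final case class StockResult(orderId: String)
  final case class OrderCreated(orderId: String)

  // Minimal context that only records what was persisted, in order.
  final class Ctx {
    val persisted: ListBuffer[Any] = ListBuffer.empty
    def persist[A](event: A): Lambda[A] = { persisted += event; Right(event) }
  }

  // Call payment, persist, call stock, persist; the first failure stops the chain.
  def createOrder(
      orderId: String,
      ctx: Ctx,
      reserveCredit: String => Lambda[PaymentResult],
      prepareOrder: String => Lambda[StockResult]
  ): Lambda[OrderCreated] =
    for {
      p <- reserveCredit(orderId)
      _ <- ctx.persist(p) // updating the state
      s <- prepareOrder(orderId)
      _ <- ctx.persist(s) // updating the state
    } yield OrderCreated(orderId)
}
```

In this stand-in, a failed payment short-circuits the comprehension, so the stock function is never called and nothing is persisted. Whether and how Rho compensates for partial failures is not covered by this sketch.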

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

[Diagram: OrderService, PaymentService, StockService; calls labelled reserveBalance and prepareOrder]
for {
  p ← ctx.callFn[PaymentFnRequest, PaymentFnResult]("PaymentFn.reserveCredit", pr)
  _ ← ctx.persist(p) // updating the state
  s ← ctx.callFn[PrepareStockRequest, StockFnResponse]("StockFn.prepareOrder", ps)
  _ ← ctx.persist(s) // updating the state
} yield orderCreationResponse(r, p, s)

Slide 54

Slide 54 text

Recall
[Diagram: FnNamespace1, FnNamespace2, …, FnNamespacen, each with its own queues ReqQ and ResQ (ReqQ1/ResQ1, ReqQ2/ResQ2, …, ReqQn/ResQn)]

Slide 55

Slide 55 text

Rho in Action

Slide 56

Slide 56 text

[Diagram: OrderService, PaymentService, StockService; calls labelled reserveBalance and prepareOrder]

Slide 57

Slide 57 text

$ rho deploy -n OrderFn --parallelism 2 --skip_jar
$ rho deploy -n PaymentFn --parallelism 2 --skip_jar
$ rho deploy -n StockFn --parallelism 2 --skip_jar

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

Performance

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

What’s Next?

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

Thanks

Slide 69

Slide 69 text

Shameless plug by Asterios Katsifodimos (@kasterios)
Online introductory course on stream processing starts on January 15th, covering all fundamental stream processing concepts (time, order, windows, joins, etc.). We are using Apache Flink for assignments and include invited talks from industry and academia. Enrollment is open.
tudelft.nl/taming-big-data-streams

Slide 70

Slide 70 text

Questions?

Slide 71

Slide 71 text

Adil Akhter
Tech Lead at ING
Amsterdam, The Netherlands
http://coyoneda.xyz
adilakhter

Slide 72

Slide 72 text

References
1. "Stateful Functions as a Service in Action", Adil Akhter, Marios Fragkoulis, Asterios Katsifodimos. In the Proceedings of the 45th International Conference on Very Large Data Bases (VLDB) 2019.
2. "Operational Stream Processing: Towards Scalable and Consistent Event-Driven Applications", Asterios Katsifodimos, Marios Fragkoulis. In the Proceedings of the 22nd International Conference on Extending Database Technology (EDBT) 2019.
3. "Benchmarking Distributed Stream Data Processing Systems", Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, Volker Markl. In the Proceedings of the International Conference on Data Engineering (ICDE) 2018.
4. "Efficient Window Aggregation with General Stream Slicing", Jonas Traub, Philipp M. Grulich, Alejandro Rodriguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl. In the Proceedings of the 22nd International Conference on Extending Database Technology (EDBT) 2019.

Slide 73

Slide 73 text

No content