Slide 1

Slide 1 text

Distributed Computation
Time and Failure in the Wild
@ryanlemmer, Cape Town
FuConf Bangalore 2014
Friday 10 October 14

Slide 2

Slide 2 text

This talk
* Distributed Programming with Storm + Akka
* Distributed + Functional?
* Focus on Realtime (not Batch)

Slide 3

Slide 3 text


Slide 4

Slide 4 text

Use Case: Analytics
Process each journal for analytics, save to analytics DB.
Diagram labels: journals DB, search DB, analytics DB

Slide 5

Slide 5 text

Sequential Execution
    for j in journals
        j1 = enrich(j)
        j2 = transform(j1)
        analytics-save(j2)
        search-index(j2)
Diagram labels: journals DB, search DB, analytics DB

Slide 6

Slide 6 text

Parallel Execution
    parallel-for j in journals
        j1 = enrich(j)
        j2 = transform(j1)
        analytics-save(j2)
        search-index(j2)
(the loop body is repeated four times on the slide)
Diagram labels: journals DB, search DB, analytics DB
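
The step from a sequential loop to a parallel one can be sketched in plain Python. This is only an illustration under stated assumptions: `enrich`, `transform` and the two dictionaries standing in for the analytics and search databases are hypothetical placeholders, and a thread pool on one machine stands in for the slide's `parallel-for`.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the slide's steps and databases.
analytics_db = {}
search_db = {}

def enrich(journal):
    return {**journal, "enriched": True}

def transform(journal):
    return {**journal, "amount_cents": round(journal["amt"] * 100)}

def process(journal):
    """The loop body from the slide: enrich, transform, save, index."""
    j1 = enrich(journal)
    j2 = transform(j1)
    analytics_db[j2["id"]] = j2   # analytics-save
    search_db[j2["id"]] = j2      # search-index
    return j2["id"]

def run_sequential(journals):
    return [process(j) for j in journals]

def run_parallel(journals, workers=4):
    # Journals are independent of each other, so the same body
    # can run on a pool of workers instead of one after another.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, journals))

journals = [{"id": f"J{i}", "amt": 1.5 * i} for i in range(8)]
ids = run_parallel(journals)
```

The body of `process` is unchanged between the two runners; only the scheduling differs, which is the point the two slides make.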

Slide 7

Slide 7 text

Distributed Execution
    j1 = enrich(j)
    j2 = transform(j1)
    analytics-save(j2)
    search-index(j2)
(the slide shows three groups of stacked copies of this body)
Diagram labels: journals DB, search DB, analytics DB

Slide 8

Slide 8 text

Apache Storm
“REALTIME”: runs continuously
“FAULT TOLERANT”: has a plan for when things go wrong
“SCALABLE”: distributed

Slide 9

Slide 9 text

Apache Storm
    j1 = enrich(j)
    j2 = transform(j1)
    analytics-save(j2)
    search-index(j2)
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index

Slide 10

Slide 10 text

Spouts + Bolts
Diagram nodes: next-journal (SPOUT), enrich (BOLT), transform (BOLT), analytics-save (BOLT), search-index (BOLT)

Slide 11

Slide 11 text

data model: tuples
    ["J323" {'amt': 107.43, ...}]
    ["J323" {'$amt': 15.70, ...}]
    ["J323" {'K-ratio': 42.11, ...}]
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index

Slide 12

Slide 12 text

clojure spout
    (defspout client-spout ["entity" "values"]
      [conf context collector]
      (spout
        (nextTuple []
          (Thread/sleep 100)
          ;; fetch the next client on every call, so each emitted tuple is fresh
          (let [next-client (next-legacy-client)]
            (emit-spout! collector ["client" next-client])))
        (ack [id])))

Slide 13

Slide 13 text

clojure bolt
    (defbolt transform-client-bolt ["client"]
      {:prepare true}
      [conf context collector]
      (bolt
        (execute [tuple]
          ;; field 1 of the incoming tuple holds the client emitted by the spout
          (let [h (.getValue tuple 1)]
            (emit-bolt! collector [(transform-tuple h)])
            (ack! collector tuple)))))

Slide 14

Slide 14 text

storm topology
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index

Slide 15

Slide 15 text

storm parallelism
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index
Parallelism hints on the diagram (‘p’): 1, 3, 3, 5, 5

Slide 16

Slide 16 text

storm grouping
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index
Parallelism hints on the diagram (‘p’): 1, 3, 3, 5, 5
Groupings on the diagram: ‘shuffle’, ‘shuffle’, ‘shuffle’
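
What a ‘shuffle’ grouping does can be sketched without Storm: each tuple leaving one stage goes to a randomly chosen task of the next stage, so load spreads roughly evenly. The function below is a hypothetical plain-Python illustration, not Storm's implementation or API; the task count of 3 mirrors one of the ‘p’ values on the slide.

```python
import random
from collections import Counter

def shuffle_grouping(tuples, num_tasks, seed=None):
    """Pair every tuple with a randomly chosen downstream task index."""
    rng = random.Random(seed)
    return [(rng.randrange(num_tasks), t) for t in tuples]

# 3000 tuples routed to a bolt that runs with a parallelism of 3.
tuples = [("J%d" % i, {"amt": i}) for i in range(3000)]
assignments = shuffle_grouping(tuples, num_tasks=3, seed=42)
load_per_task = Counter(task for task, _ in assignments)
```

Because the choice ignores tuple contents, two tuples for the same journal may land on different tasks; that is fine for stateless steps such as enrich or transform.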

Slide 17

Slide 17 text

fault tolerance
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index
Parallelism hints on the diagram (‘p’): 1, 3, 3, 5, 5
Groupings on the diagram: ‘shuffle’, ‘shuffle’, ‘shuffle’

Slide 18

Slide 18 text

fault tolerance

Slide 19

Slide 19 text

idempotence
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index
Diagram annotations: x2, side-effects!
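
Why a replayed tuple (the "x2" on the slide) is a problem for side effects, and why idempotent writes avoid it, can be shown with a small sketch. Everything here is a hypothetical illustration: the two save functions, the dictionaries standing in for a database, and the simulated replay are assumptions, not Storm code.

```python
def save_by_increment(db, journal):
    # Not idempotent: a replay adds the amount a second time.
    db["total"] = db.get("total", 0) + journal["amt"]

def save_by_key(db, journal):
    # Idempotent: writing the same journal under its id twice leaves one row.
    db[journal["id"]] = journal["amt"]

def deliver_at_least_once(save, db, journals, replayed_ids):
    """Deliver every journal once, and the replayed ones a second time."""
    for journal in journals:
        save(db, journal)
        if journal["id"] in replayed_ids:
            save(db, journal)  # simulated replay after a failure

journals = [{"id": "J1", "amt": 10}, {"id": "J2", "amt": 5}]
counter_db, keyed_db = {}, {}
deliver_at_least_once(save_by_increment, counter_db, journals, {"J1"})
deliver_at_least_once(save_by_key, keyed_db, journals, {"J1"})
```

After the run, the incrementing store has counted J1 twice, while the keyed store still sums to the true total.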

Slide 20

Slide 20 text

Storm Trident
transactional topologies
run-once semantics
strong ordering on data processing
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index (each annotated x1)

Slide 21

Slide 21 text

stream computing
Diagram labels: (queue), (queue)

Slide 22

Slide 22 text

stream computing
* stream processing
* realtime analytics
* continuous computation
* distributed RPC
...

Slide 23

Slide 23 text

streaming soup
Cambrian explosion!
* Apache Storm
* Apache Samza
* Spark Streaming
* Nokia Dempsy
* Esper
* Streambase
* Akka Streams

Slide 24

Slide 24 text

lambda architectures
Diagram labels: new data, Batch Processor, Realtime Processor, merged view
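
The "merged view" in the diagram can be sketched as follows, assuming the common reading of a lambda architecture: a batch layer recomputes a view over older data, a realtime layer updates incrementally for recent data, and queries combine the two. The event shape and all names are hypothetical.

```python
from collections import Counter

def batch_view(events):
    """Recomputed from scratch over the full set of older events."""
    return Counter(e["account"] for e in events)

class RealtimeView:
    """Updated incrementally as each new event arrives."""
    def __init__(self):
        self.counts = Counter()

    def on_event(self, event):
        self.counts[event["account"]] += 1

def merged_view(batch, realtime):
    # A query sees batch results plus whatever arrived since the last batch run.
    return batch + realtime.counts

history = [{"account": "a"}, {"account": "a"}, {"account": "b"}]
batch = batch_view(history)

realtime = RealtimeView()
for event in [{"account": "a"}, {"account": "c"}]:
    realtime.on_event(event)

view = merged_view(batch, realtime)
```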

Slide 25

Slide 25 text

AKKA
“REALTIME”: runs continuously
“FAULT TOLERANT”: “let it crash” Actor Model Fault Tolerance
“SCALABLE”: scale up (concurrency), scale out (distributed), elastic

Slide 26

Slide 26 text

OO: Single threaded
    class Account {
      private var balance = 0
      // each method returns the new balance, matching the declared Int result
      def add(num: Int): Int = {
        balance += num
        balance
      }
      def rem(num: Int): Int = {
        balance -= num
        balance
      }
    }

    account.add(100)
    account.add(50)
    account.rem(40)

Slide 27

Slide 27 text

OO: Multi-threaded
    class Account {
      private var balance = 0
      // each method returns the new balance, matching the declared Int result
      def add(num: Int): Int = {
        balance += num
        balance
      }
      def rem(num: Int): Int = {
        balance -= num
        balance
      }
    }

    account.add(100)
    account.add(50)
    account.rem(40)

Slide 28

Slide 28 text

What if?
    class Account {
      private var balance = 0
      // each method returns the new balance, matching the declared Int result
      def add(num: Int): Int = {
        balance += num
        balance
      }
      def rem(num: Int): Int = {
        balance -= num
        balance
      }
    }

    account.add(100)
    account.add(50)
    account.rem(40)

Slide 29

Slide 29 text

Actor Messages
    class Account extends Actor {
      var balance = 0
      def receive = {
        // messages are taken from the mailbox and handled one at a time
        case Add(amt: Int) =>
          balance += amt
        case Rem(amt: Int) =>
          balance -= amt
      }
    }

    account ! Add(100)
    account ! Add(50)
    account ! Rem(40)
Diagram label: MAILBOX
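
The mailbox idea can be imitated in plain Python to show why the actor version needs no locks: many threads may send, but only one thread ever touches `balance`. This is a rough sketch under stated assumptions, not Akka; `AccountActor`, `tell` and the tuple-shaped messages are hypothetical names chosen for the illustration.

```python
import queue
import threading

class AccountActor:
    """Owns its balance; the only way to change it is to send a message."""

    def __init__(self):
        self.balance = 0
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        # Fire and forget: enqueue the message and return immediately.
        self._mailbox.put(message)

    def _run(self):
        # A single thread drains the mailbox, one message at a time.
        while True:
            message = self._mailbox.get()
            if message is None:
                break
            kind, amount = message
            if kind == "add":
                self.balance += amount
            elif kind == "rem":
                self.balance -= amount

    def stop(self):
        # The stop marker queues behind all earlier messages.
        self._mailbox.put(None)
        self._thread.join()

def send_many(actor, count):
    for _ in range(count):
        actor.tell(("add", 1))

account = AccountActor()
senders = [threading.Thread(target=send_many, args=(account, 1000)) for _ in range(4)]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()
account.tell(("rem", 40))
account.stop()
```

Four threads send concurrently, yet the final balance is deterministic because updates are serialised through the queue.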

Slide 30

Slide 30 text

Actor Streaming (naive)
Diagram nodes: next-journal, enrich, transform, analytics-save, search-index
Parallelism hints on the diagram (‘p’): 1, 3, 3, 5, 5
Groupings on the diagram: ‘shuffle’, ‘shuffle’, ‘shuffle’

Slide 31

Slide 31 text

Actor Streaming (naive)
    class JournalGen extends Actor {
      // enrich1..enrich3 are assumed to be routees wrapping the Enrich actors
      val router = Router(RandomRoutingLogic(),
                          Vector(enrich1, enrich2, enrich3))
      def receive = {
        case NextJournal(journalQ) =>
          val journal = journalQ.pop()
          router.route(Enrich(journal), sender())
      }
    }
Diagram nodes: Journal Gen, Enrich (x3), routed ‘random’

Slide 32

Slide 32 text

Actor Streaming (naive)
    class Enrich extends Actor {
      def receive = {
        case Enrich(journal) =>
          val j = enrich(journal)
          transform ! j
      }
    }
Diagram nodes: Journal Gen, Enrich (x3), Transform (x3), routed ‘random’

Slide 33

Slide 33 text

Actor Streaming (naive)
Diagram nodes: next-journal, enrich (x3), transform (x3), analytics-save (x6), search-index (x6)
Routing on the diagram: ‘random’, ‘round robin’, ‘round robin’

Slide 34

Slide 34 text

Fault tolerance
ERROR!
Diagram nodes: next-journal, enrich (x3), transform (x3), analytics-save (x6), search-index (x6)

Slide 35

Slide 35 text

Fault tolerance
A Supervisor can:
* RESUME
* RESTART
* STOP
* ESCALATE (FAIL)
2 strategies: OneForOne or AllForOne
Diagram: ERROR! in one of transform, analytics-save, analytics-save

Slide 36

Slide 36 text

Fault tolerance
    override val supervisorStrategy =
      OneForOneStrategy(maxNrOfRetries = 10,
                        withinTimeRange = 1 minute) {
        case _: ThisException    => Resume
        case _: ThatException    => Restart
        case _: AnotherException => Stop
        case _: Exception        => Escalate
      }
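
The four directives can be made concrete with a toy supervisor in plain Python. This is a simplified sketch, not Akka: the exception names are borrowed from the slide, `Worker` and `Supervisor` are hypothetical, only one child is supervised, and the retry limits (`maxNrOfRetries`, `withinTimeRange`) are left out.

```python
class ThisException(Exception): pass
class ThatException(Exception): pass
class AnotherException(Exception): pass

def decide(error):
    # First match wins, like the case clauses on the slide.
    if isinstance(error, ThisException):
        return "resume"
    if isinstance(error, ThatException):
        return "restart"
    if isinstance(error, AnotherException):
        return "stop"
    return "escalate"

class Worker:
    def __init__(self):
        self.processed = 0  # local state, lost on restart

    def handle(self, msg):
        if isinstance(msg, Exception):
            raise msg  # simulate a failure while handling a message
        self.processed += 1

class Supervisor:
    def __init__(self, make_child):
        self.make_child = make_child
        self.child = make_child()
        self.log = []

    def deliver(self, msg):
        if self.child is None:
            self.log.append("dropped")  # child was stopped earlier
            return
        try:
            self.child.handle(msg)
        except Exception as error:
            directive = decide(error)
            self.log.append(directive)
            if directive == "restart":
                self.child = self.make_child()  # fresh child, fresh state
            elif directive == "stop":
                self.child = None
            elif directive == "escalate":
                raise  # hand the failure to the next level up
            # "resume": keep the child and its state, skip the failed message

sup = Supervisor(Worker)
sup.deliver("j1")
sup.deliver(ThisException())
sup.deliver("j2")
processed_after_resume = sup.child.processed
sup.deliver(ThatException())
processed_after_restart = sup.child.processed
sup.deliver(AnotherException())
sup.deliver("j3")
```

Resume keeps the child's state, restart replaces the child with a fresh one, and stop removes it, so later messages are dropped in this toy version.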

Slide 37

Slide 37 text

OO vs Actor Model
* Communicate via Methods vs Communicate via Messages
* Synchronous vs “fire and forget”
* Shared State + Behaviour vs Local State + Behaviour
* Local vs location transparent
Diagram labels: ask, tell

Slide 38

Slide 38 text

Designing with Actors
* single responsibility Actors
* find the “right” granularity for
  - Messages
  - Actor Hierarchies
  - failure zones

Slide 39

Slide 39 text

Actors: problem space
* Work Distribution (incl. Streaming)
* Domain-driven actor apps
  - Actors => Entities
  - Actor Hierarchies => Aggregates
  - Actor Messages => Domain Events

Slide 40

Slide 40 text

Storm vs Akka
* Stream computation vs Actor Concurrency
* High level abstraction vs Low level, more powerful
* Topology: static vs Dynamic topology
* Directed graph vs 2-way
* Heavy bolts, spouts vs Lightweight Actors

Slide 41

Slide 41 text

time for a manifesto!
Reactive Manifesto
* interactive
* fault tolerant
* scalable

Slide 42

Slide 42 text

AKKA Streams
Reactive Streams
JVM Standard for async, distributed, stream processing

Slide 43

Slide 43 text

Time, State, Failure
Time: It’s about the Order of events. Minimise enforced order!

Slide 44

Slide 44 text

Time, State, Failure
Time: It’s about the Order of events. Minimise enforced order!
State: It’s Change of State that hurts most. Minimise Change! (immutability)

Slide 45

Slide 45 text

Time, State, Failure
Time: It’s about the Order of events. Minimise enforced order!
State: It’s Change of State that hurts most. Minimise Change! (immutability)
Fault Tolerance: Embrace Failure, plan for it. Failure is a first class citizen.

Slide 46

Slide 46 text

Distributed + functional
Concurrency Oriented Programming Languages
* concurrent
* fault tolerant
* scalable

Slide 47

Slide 47 text

Distributed, the future?
CRDTs
“a data type whose operations commute when they are concurrent. Replicas eventually converge without any complex concurrency control”
“A comprehensive study of Convergent and Commutative Replicated Data Types” - Letia et al. - 2009
“ACID 2.0”
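
A small example makes the convergence claim tangible. The sketch below is a grow-only counter, one of the simplest replicated data types of this family, written in the state-based style where replicas exchange and merge their whole state; the class and method names are assumptions for the illustration, not taken from the slides or any library.

```python
class GCounter:
    """Grow-only counter: each replica only increments its own slot."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-replica maximum: merging in any order, any number of times,
        # gives the same result.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)

# Two replicas accept updates independently, with no coordination.
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)

# They exchange state in either order and converge.
a.merge(b)
b.merge(a)
b.merge(a)  # a repeated merge changes nothing
```

Neither replica takes a lock or waits for the other; agreement comes from the merge rule alone.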

Slide 48

Slide 48 text

Thank YOU
@ryanlemmer, Cape Town
FuConf Bangalore 2014