Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Distributed Computation: dealing with Time and Failure in the wild

ryan lemmer
October 10, 2014

Distributed Computation: dealing with Time and Failure in the wild

FuConf, 2014, Bangalore

ryan lemmer

October 10, 2014
Tweet

More Decks by ryan lemmer

Other Decks in Technology

Transcript

  1. @ryanlemmer Cape Town Distributed Computation Time and Failure in the

    Wild FuConf Bangalore 2014 1 Friday 10 October 14
  2. * Distributed Programming with Storm + Akka * Distributed +

    Functional? This talk * Focus on Realtime (not Batch) 2 Friday 10 October 14
  3. journals DB process each journal for analytics, save to analytics

    DB Use Case: Analytics search DB analytics DB 4 Friday 10 October 14
  4. journals DB search DB for j in journals j1 =

    enrich(j) j2 = transform(j1) analytics-save(j2) search-index(j2) analytics DB Sequential Execution 5 Friday 10 October 14
  5. parallel-for j in journals search DB analytics DB journals DB

    Parallel Execution j1  =  enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) j1  =  enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) j1  =  enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) j1  =  enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) 6 Friday 10 October 14
  6. journals DB search DB analytics DB Distributed Execution j1  =

      enrich( j1  =   enrich( j1  =   enrich( j1  =  enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) j1  =   enrich( j1  =   enrich( j1  =   enrich( j1  =  enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) j1  =   enrich( j1  =   enrich( j1  =   enrich( j1  =  enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) 7 Friday 10 October 14
  7. “REALTIME” “FAULT TOLERANT” “SCALABLE” runs continuously has a plan for

    when things go wrong distributed Apache Storm 8 Friday 10 October 14
  8. enrich transform analytics-save search-index next- journal Apache Storm j1  =

     enrich(j) j2  =  transform(j1) analytics-­‐save(j2) search-­‐index(j2) 9 Friday 10 October 14
  9. enrich transform analytics-save search-index next- journal [“J323” {‘amt’: 107.43, ...}

    [“J323” {‘$amt’: 15.70, ...} [“J323” {‘K-ratio’: 42.11, ...} data model: tuples 11 Friday 10 October 14
  10. (defspout  client-­‐spout  ["entity"  “values”]    [conf  context  collector]    (let

     [next-­‐client  (next-­‐legacy-­‐client)                tuple              [“client”  next-­‐client]]        (spout          (nextTuple  []              (Thread/sleep  100)              (emit-­‐spout!  collector  tuple))          (ack  [id])))) clojure spout 12 Friday 10 October 14
  11. (defbolt  transform-­‐client-­‐bolt  ["client"]              

     {:prepare  true}                [conf  context  collector]        (bolt          (execute  [tuple]              (let  [h  (.getValue  tuple  1)]                  (emit-­‐bolt!  collector  [(transform-­‐tuple  h)])                  (ack!  collector  tuple))))) clojure bolt 13 Friday 10 October 14
  12. enrich transform analytics-save search-index next- journal ‘p’:  1 ‘p’:  3

    ‘p’:  3 ‘p’:  5 ‘p’:  5 storm parallelism 15 Friday 10 October 14
  13. enrich transform analytics-save search-index next- journal ‘p’:  1 ‘p’:  3

    ‘p’:  3 ‘p’:  5 ‘p’:  5 storm grouping ‘shuffle’ ‘shuffle’ ‘shuffle’ 16 Friday 10 October 14
  14. enrich transform analytics-save search-index next- journal ‘p’:  1 ‘p’:  3

    ‘p’:  3 ‘p’:  5 ‘p’:  5 fault tolerance ‘shuffle’ ‘shuffle’ ‘shuffle’ 17 Friday 10 October 14
  15. enrich transform analytics-save search-index next- journal x1 transactional topologies x1

    x1 x1 x1 run-once semantics strong ordering on data processing Storm Trident 20 Friday 10 October 14
  16. stream computing * stream processing * realtime analytics * continuous

    computation * distributed RPC ... 22 Friday 10 October 14
  17. streaming soup Apache Storm Apache SAMZA Spark Streaming Nokia Dempsy

    Esper Streambase Akka Streams Cambrian explosion! 23 Friday 10 October 14
  18. “REALTIME” “FAULT TOLERANT” “SCALABLE” runs continuously “let it crash” Actor

    Model Fault Tolerance scale up (concurrency), scale out (distributed), elastic AKKA 25 Friday 10 October 14
  19. class  Account  {        private  var  balance  =

     0        def  add(num:  Int):  Int  =  {            balance  +=  num}        def  rem(num:  Int):  Int  =  {            balance  -­‐=  num}} account.add(100) account.add(50) account.rem(40) OO: Single threaded 26 Friday 10 October 14
  20. account.add(100) account.add(50) account.rem(40) OO: Multi-threaded class  Account  {    

       private  var  balance  =  0        def  add(num:  Int):  Int  =  {            balance  +=  num}        def  rem(num:  Int):  Int  =  {            balance  -­‐=  num}} 27 Friday 10 October 14
  21. account.add(100) account.add(50) account.rem(40) What if? class  Account  {    

       private  var  balance  =  0        def  add(num:  Int):  Int  =  {            balance  +=  num}        def  rem(num:  Int):  Int  =  {            balance  -­‐=  num}} 28 Friday 10 October 14
  22. class  Account  extends  Actor{        var  balance  =

     0        def  receive  =  {            case  Add(amt:Int)  =>                balance  +=  num            case  Rem(amt:  Int)  =>                balance  -­‐=  num}} Actor Messages account ! Add(100) account ! Add(50) account ! Rem(40) MAILBOX 29 Friday 10 October 14
  23. enrich transform analytics-save search-index next- journal ‘p’:  1 ‘p’:  3

    ‘p’:  3 ‘p’:  5 ‘p’:  5 ‘shuffle’ ‘shuffle’ ‘shuffle’ Actor Streaming (naive) 30 Friday 10 October 14
  24. class  JournalGen  extends  Actor{    val  router  =    Router(RandomRoutingLogic(),

                                                   [enrich1,  enrich2,  enrich3])      def  receive  =  {            case  NextJournal(journalQ)  =>                journal  =  journalQ.pop()                router.route(Enrich(journal),  sender()) }} enrich Journal Gen ‘random’ enrich Enrich Actor Streaming (naive) 31 Friday 10 October 14
  25. enrich Journal Gen enrich Enrich class  Enrich  extends  Actor{  

     def  receive  =  {            case  Enrich(journal)  =>                j  =  enrich(journal)                transform  !  j }} Transform Transform Transform ‘random’ Actor Streaming (naive) 32 Friday 10 October 14
  26. enrich transform next- journal ‘random’ enrich enrich transform transform analytics-save

    analytics-save search-index search-index analytics-save analytics-save analytics-save analytics-save search-index search-index search-index search-index ‘round robin’ ‘round robin’ Actor Streaming (naive) 33 Friday 10 October 14
  27. ERROR! enrich transform next- journal enrich enrich transform transform analytics-save

    analytics-save search-index search-index analytics-save analytics-save analytics-save analytics-save search-index search-index search-index search-index Fault tolerance 34 Friday 10 October 14
  28. A Supervisor can: RESUME RESTART STOP ESCALATE (FAIL) ERROR! transform

    analytics-save analytics-save 2 strategies: OneForOne or AllForOne Fault tolerance 35 Friday 10 October 14
  29. override  val  supervisorStrategy  =    OneForOneStrategy(maxNrOfRetries  =  10,    

                                         withinTimeRange  =  1  minute)  {        case  _:  ThisException        =>  Resume        case  _:  ThatException        =>  Restart        case  _:  AnotherException  =>  Stop        case  _:  Exception                =>  Escalate } Fault tolerance 36 Friday 10 October 14
  30. OO vs Actor Model Communicate via Methods Communicate via Messages

    Synchronous “fire and forget” Shared State + Behaviour Local State + Behaviour Local location transparent ask tell 37 Friday 10 October 14
  31. * single responsibility Actors * find the “right” granularity for

    - Messages - Actor Hierarchies - failure zones Designing with Actors 38 Friday 10 October 14
  32. * Work Distribution (incl. Streaming) * Domain-driven actor apps -

    Actors => Entities - Actor Hierarchies => Aggregates - Actor Messages => Domain Events Actors: problem space 39 Friday 10 October 14
  33. Storm vs Akka Stream computation Actor Concurrency High level abstraction

    Low level, more powerful Topology: static Dynamic topology Directed graph 2-way Heavy bolts, spouts Lightweight Actors 40 Friday 10 October 14
  34. Time, State, Failure It’s about the Order of events. Minimise

    enforced order! Time 43 Friday 10 October 14
  35. Time, State, Failure It’s about the Order of events. Minimise

    enforced order! Time It’s Change of State that hurts most. Minimise Change! (immutability) State 44 Friday 10 October 14
  36. Time, State, Failure It’s about the Order of events. Minimise

    enforced order! Time Embrace Failure, plan for it. Failure is a first class citizen. Fault Tolerance State It’s Change of State that hurts most. Minimise Change! (immutability) 45 Friday 10 October 14
  37. Distributed, the future? CDRT’s “ a data type whose operations

    commute when they are concurrent. Replicas eventually converge without any complex concurrency control” “A comprehensive study of Convergent and Commutative Replicated Data Types” - Letia et. al. - 2009 “ACID 2.0” 47 Friday 10 October 14