We All Make Distributed Systems - wroc_love.rb 2017

mrzasa
March 19, 2017

Presentation for my talk at wroc_love.rb 2017 conference.

Transcript

  1. WE ALL MAKE DISTRIBUTED SYSTEMS
    MACIEJ RZĄSA @mjrzasa
  2. SERIOUSLY? DISTRIBUTED SYSTEMS? source: goodfreephotos.com

  3. ME
    Software Engineer @ TextMaster
    main interests: writing software that matters, self-organising teams, distributed systems (occasionally)
    knowledge sharing: Rzeszów Ruby User Group (rrug.pl), Rzeszów University of Technology
  4. AGENDA
    Why? - reasons to care about this talk
    What? - limitations of distributed systems
    How? - case study
  5. DISTRIBUTED SYSTEM
    "A collection of independent computers that appears to its users as a single coherent system." - Andrew Tanenbaum
    "A distributed system is one where a machine I’ve never heard of can cause my program to fail." - Leslie Lamport
  6. SIMPLE AS WEBAPP?

  7. SIMPLE AS WEBAPP?

  8. DISTRIBUTED SYSTEM
    "A collection of independent computers that appears to its users as a single coherent system." - Andrew Tanenbaum
  9. SO WHAT? source: pinterest.com

  10. THREE GUARANTEES
    Consistency
    Availability
    Partition tolerance

  11. CONSISTENCY
    single-copy consistency
    not the same as ACID consistency
    example: write on Android, read on web
  12. AVAILABILITY
    every non-failing node responds meaningfully
    example: I can work on any client (web/Android/iOS/WP)
  13. PARTITION TOLERANCE
    system works even with some messages missing
    network failure: loss of packets
    offline mode
    "If all you have is a timeout, everything looks like a partition" - @nkeywal
  14. THREE GUARANTEES: CAP THEOREM
    you cannot have all three
    evidence: try to save offline and read on a different client
  15. CA: I FEEL LUCKY
    consistency and availability
    no partition handling
    single host
    network is unreliable
  16. CP: ORDNUNG MUSS SEIN
    consistency and partition tolerance
    consistency guaranteed
    no offline mode, app works only with network connection
    limited features possible (read-only)
    after reconnection: fetching data (one-way sync)
    convenient for developers
  17. AP: WORK AROUND THE CLOCK
    availability and partition tolerance
    app usable all the time
    offline mode
    two-way sync required
    profitable for the client
  18. CAP
    CA - I feel lucky
    CP - Ordnung muss sein
    AP - Work around the clock
    "The choice of availability over consistency is a business choice, not a technical one." - @coda
  19. CASE STUDY source: wikipedia.org
  20. APPLICATION OVERVIEW
    domain: recycling
    mobile app for field workers + web panel for admins
    offline: payments, product catalog
    trade-offs: validation, consistency
    challenges: synchronization, conflict resolution
  21. SYNCHRONIZATION
    fetching big data set
    two-way sync
    concurrent edits
    retransmission

  22. SYNC ALL THE DATA! source: memegenerator.com
  23. SYNC - ONE WAY (CP)
    pricing, published ~1/week, bulk of them at the same time (Monday)
    version 1.0: fetch all changes at once
    growing data size: timeouts, memory limitations on Android
    obvious solution: pagination
  24. PAGINATION
    GET /items?page=1
    GET /items?since=1234&to=3456
    GET /items?since=1234&page_size=3
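The `since` + `page_size` variant can be sketched roughly as follows. This is a minimal illustration, not the application's code: it assumes each item carries a monotonically increasing `version` number (the slides do not say what `since` refers to), and `Item`, `SERVER_ITEMS`, `fetch_page` and `fetch_all` are hypothetical names, with an in-memory array standing in for the HTTP endpoint.

```ruby
# Hypothetical item record; `version` is an assumed, monotonically
# increasing change counter assigned by the server.
Item = Struct.new(:id, :version, :price)

# Stand-in for the server's data set (a database in a real application).
SERVER_ITEMS = (1..8).map { |i| Item.new(i, 1000 + i, i * 10) }

# Stand-in for `GET /items?since=...&page_size=...`:
# returns at most `page_size` items changed after `since`, oldest first.
def fetch_page(since:, page_size:)
  SERVER_ITEMS
    .select { |item| item.version > since }
    .sort_by(&:version)
    .first(page_size)
end

# Client loop: the client picks the page size and keeps asking
# until an empty page signals that it has caught up.
def fetch_all(since:, page_size:)
  items = []
  loop do
    page = fetch_page(since: since, page_size: page_size)
    break if page.empty?

    items.concat(page)
    since = page.last.version # resume after the last item received
  end
  items
end
```

Because the cursor is the last version received rather than a page number, a client that runs out of memory or times out can retry with a smaller `page_size` and continue from where it stopped.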
  25. SYNC - TWO WAY (AP)
    tickets (a kind of shopping cart)
    payments offline (!)
    every client (Android) creates its own tickets and adds ticket items
    tickets available on the server and sent to other mobile clients
  26. SYNC - TWO WAY (AP)
    CLIENT
      def sync
        timestamp = get_last_sync_timestamp
        client_changes = choose_updated(timestamp)
        server_changes = send(client_changes, timestamp)
        # wait...
        store(server_changes)
        set_sync_timestamp(Time.now)
      end
    SERVER
      def sync(client_changes, timestamp)
        server_changes = choose_updated(timestamp)
        store(client_changes)
        send(server_changes)
      end
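A runnable approximation of this exchange, under several assumptions that are not on the slide: both sides are in-memory objects, timestamps are unique integer ticks of a shared logical clock passed in as `now` (instead of `Time.now`), and the server restamps records when it receives them. `Replica`, `SyncServer` and `SyncClient` are hypothetical names.

```ruby
# Hypothetical in-memory replica shared by both sides of the sync.
# Records are hashes; :updated_at holds a logical clock tick (an Integer),
# an assumption made to keep the example deterministic.
class Replica
  attr_reader :records

  def initialize
    @records = {}
  end

  def write(id, attrs, now)
    @records[id] = attrs.merge(id: id, updated_at: now)
  end

  # Records modified after the given tick.
  def choose_updated(timestamp)
    @records.values.select { |r| r[:updated_at] > timestamp }
  end
end

class SyncServer < Replica
  # As on the slide, outgoing changes are selected *before* the client's
  # changes are stored, so a client does not get its own changes echoed back.
  def sync(client_changes, timestamp, now)
    server_changes = choose_updated(timestamp)
    # Restamp on receipt (assumption) so other clients see these as new.
    client_changes.each { |r| write(r[:id], r, now) }
    server_changes
  end
end

class SyncClient < Replica
  def initialize(server)
    super()
    @server = server
    @last_sync = 0
  end

  def sync(now)
    client_changes = choose_updated(@last_sync)
    server_changes = @server.sync(client_changes, @last_sync, now)
    # Server timestamps are kept (all <= now), so these are not re-sent.
    server_changes.each { |r| @records[r[:id]] = r }
    @last_sync = now
    server_changes
  end
end
```

With two clients, a record written on the first and synced reaches the second on its next sync, and neither client receives it again afterwards. The sketch still overwrites whole records, so concurrent edits of the same record are lost.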
  27. TICKET EDIT
    cancelling tickets (purchase) on the server
    changed tickets sent to the client
    FAIL! both sides edit a ticket, changes lost
    solution: sync of status changes, not whole tickets
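One way to read "sync of status changes, not whole tickets" is to exchange small immutable events and apply them to the local ticket. The sketch below illustrates that idea only; `StatusChange`, `TicketStore` and the integer `at` field (an assumed logical clock) are hypothetical, and ties between changes with the same `at` are not handled.

```ruby
require "securerandom"

# Immutable fact: "ticket X changed to status Y at logical time T".
StatusChange = Struct.new(:uuid, :ticket_id, :status, :at)

class TicketStore
  attr_reader :tickets

  def initialize
    @tickets = Hash.new { |h, id| h[id] = { status: "open", items: [] } }
    @log = {} # every status change seen so far, keyed by uuid
  end

  # Local edit that never travels as part of a status change.
  def add_item(ticket_id, item)
    @tickets[ticket_id][:items] << item
  end

  # Local status edit: recorded as a new event, then applied.
  def change_status(ticket_id, status, at)
    apply(StatusChange.new(SecureRandom.uuid, ticket_id, status, at).freeze)
  end

  # Applies a local or received event; a repeated event is a no-op.
  def apply(change)
    return change if @log.key?(change.uuid)

    @log[change.uuid] = change
    latest = @log.values
                 .select { |c| c.ticket_id == change.ticket_id }
                 .max_by(&:at)
    @tickets[change.ticket_id][:status] = latest.status
    change
  end

  # Events to send to the other side.
  def changes
    @log.values
  end
end
```

If the server cancels a ticket while a client adds an item to it offline, applying the server's events on the client changes only the status, and the added item survives.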
  28. ONE DOES NOT SIMPLY SYNC MUTABLE STATE source: youtube.com
  29. RETRANSMISSION
    Client sends payment records
    Server saves records and updates account data
    What if the client loses connection?
  30. RETRANSMISSION
      def sync(data)
        create_transaction(data)
        account.redeem(data.amount)
      end
    PUT /transactions { amount: 1000 }
      def sync(data)
        transaction = find_transaction(data.uuid)
        return if transaction
        create_transaction(data)
        account.redeem(data.amount)
      end
    PUT /transactions { uuid: "2db1ec4c-...", amount: 1000 }
  31. CLIENT-SIDE IDENTIFIERS
    AUTOINCREMENT is not very useful in a distributed environment ;-)
    UUID (v4), e.g. e4043456-b29e-4d80-afaf-4d65246f1d36
    really low risk of collision, may be generated on the client side
    sync can be repeated multiple times
    helpful on failures (of the network, server, client)
    effect: idempotence
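The effect of a client-generated UUID on retransmission can be tried out with a small sketch. `PaymentServer` is a hypothetical in-memory stand-in for the server, and `redeem` is assumed here to subtract the amount from an account balance; only `SecureRandom.uuid` (Ruby standard library, generates a v4 UUID) is a real API.

```ruby
require "securerandom"

# Hypothetical in-memory server: `transactions` is keyed by the
# client-generated UUID, `balance` stands in for the account data.
class PaymentServer
  attr_reader :transactions, :balance

  def initialize(balance)
    @transactions = {}
    @balance = balance
  end

  # Same shape as the second `sync` on the slide: a transaction that is
  # already stored is skipped, so a retry cannot redeem the amount twice.
  def sync(data)
    return if @transactions.key?(data[:uuid])

    @transactions[data[:uuid]] = data
    @balance -= data[:amount] # assumed meaning of `account.redeem`
  end
end

# Client side: the UUID is generated once, when the payment is recorded
# offline, and reused unchanged on every retransmission.
payment = { uuid: SecureRandom.uuid, amount: 1000 }

server = PaymentServer.new(5000)
server.sync(payment) # first attempt; suppose the response gets lost
server.sync(payment) # retry with the same UUID changes nothing
```

The client does not need to know whether the first request reached the server: sending the same record again is safe either way.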
  32. LESSON LEARNED

  33. SYNCHRONIZATION PATTERNS
    let client decide on data scope and size: GET /items?since=1234&page_size=3
    exclude received changes in two-way sync
    sync immutable rather than mutable data
    identify data on client to ensure idempotence
  34. WEB AS DISTRIBUTED SYSTEM
    limitations of distributed systems are applicable to web/mobile apps as well
    the network is unreliable
    CAP: consistency - availability - partition tolerance: you cannot have all three
    CP and AP approaches may be mixed in one app
    synchronize immutable values and apply them to mutable objects (events vs entities)
    idempotency matters
    source of the icons on my diagrams: http://www.flaticon.com
  35. REFERENCES
    CAP 12 years later
    Jepsen: data safety test in various distributed systems
    Network is reliable: really? ;-)
    Starbucks Does Not Use Two-Phase Commit
    Latency: The New Web Performance Bottleneck
    source of the icons on my diagrams: http://www.flaticon.com