
The Shell Game Called Eventual Consistency


As we build highly scalable distributed systems, the central data store and its transactions are no longer a safety net we can afford. In the world of event sourcing and CQRS (Command Query Responsibility Segregation), we need to design clever systems that don't show cracks and seams where eventual consistency is at play. We will tackle those unpleasant invariants and race conditions head-on and investigate some technical and non-technical smoke-and-mirrors solutions that deliver a positive experience to end users while finding the performance sweet spot.

We are using various PaaS and serverless offerings to build more and more distributed systems. Often these systems need to work together to produce a result. When performance and scalability are the priority, consistency (in the CAP theorem sense) takes a back seat. We still need to find ways to shelter the end user from these design realities, and the aim of this talk is to explore how: by changing the business process, or by doing clever tricks on the front end that give the backend a heartbeat to catch up. There are countless ways to do it. My goal is to investigate a few of them and get the conversation started.

Dasith Wijesiriwardena

February 21, 2021


Transcript

  1. I am here because I have a love hate relationship

    with distributed systems. HELLO! I AM Dasith @dasiths dasith.me
  2. Dealing with Eventual Consistency Introduction to CAP theorem Agenda Recap

    and Conclusion Strong Consistency vs Availability @dasiths
  3. C-P Systems Write Example 1: All nodes are connected The

    system isn’t available until the changes are propagated to all nodes. @dasiths Consistent – Partition Tolerant
  4. C-P Systems Write Example 1: All nodes are connected Read

    The system isn’t available until the changes are propagated to all nodes. @dasiths Consistent – Partition Tolerant
  5. C-P Systems Write Example 1: All nodes are connected Read

    The system isn’t available until the changes are propagated to all nodes. @dasiths Speed Consistent – Partition Tolerant
  6. C-P Systems Write Read Example 1: All nodes are connected

    Write Example 2: Connectivity intermittent The system isn’t available until the changes are propagated to all nodes. @dasiths Speed Consistent – Partition Tolerant
  7. C-P Systems Write Read Example 1: All nodes are connected

    Write Example 2: Connectivity intermittent The system isn’t available until the changes are propagated to all nodes. @dasiths Speed Consistent – Partition Tolerant
  8. C-P Systems Write Read Example 1: All nodes are connected

    Write Example 2: Connectivity intermittent The system isn’t available until connectivity is re-established. The system isn’t available until the changes are propagated to all nodes. @dasiths Speed Consistent – Partition Tolerant
  9. • Strong consistency is critical. • Supports transactions that cover

    all nodes. • Generally easier to develop than A-P systems. C-P Systems @dasiths Consistent – Partition Tolerant
  10. A-P Systems Write Example 1: All nodes are connected Read

    The system isn’t immediately consistent but immediately available. Stale @dasiths Available – Partition Tolerant
  11. A-P Systems Write Example 1: All nodes are connected Read

    Read The system isn’t immediately consistent but immediately available. V1 V2 @dasiths Available – Partition Tolerant
  12. A-P Systems Write Example 1: All nodes are connected Read

    Read The system isn’t immediately consistent but immediately available. @dasiths Available – Partition Tolerant
  13. A-P Systems Write Example 1: All nodes are connected Read

    Read Write Example 2: Connectivity intermittent The system isn’t immediately consistent but immediately available. @dasiths Available – Partition Tolerant
  14. A-P Systems Write Example 1: All nodes are connected Read

    Read Write Example 2: Connectivity intermittent The system isn’t immediately consistent but immediately available. @dasiths Available – Partition Tolerant
  15. A-P Systems Write Example 1: All nodes are connected Read

    Read Write Example 2: Connectivity intermittent Read The system isn’t immediately consistent but immediately available. Stale @dasiths Available – Partition Tolerant
  16. Read A-P Systems Write Example 1: All nodes are connected

    Read Read The system isn’t immediately consistent but immediately available. Write Example 2: Connectivity intermittent Read The system is available but won’t be consistent until connectivity is re-established. V1 V2 @dasiths Available – Partition Tolerant
  17. A-P Systems Write Example 1: All nodes are connected Read

    Read The system isn’t immediately consistent but immediately available. Write Example 2: Connectivity intermittent The system is available but won’t be consistent until connectivity is re-established. @dasiths “Eventual Consistency” Available – Partition Tolerant Read Read V1 V2
  18. • Availability is critical. • No all node covering transactions.

    (i.e. 2-phase commit). • Introduces “lesser forms of consistency”. • Developing can be complicated. A-P Systems @dasiths Available – Partition Tolerant
  19. • Availability is a must. • RDBMS are not the

    only choice. • Strong consistency is expensive. • Storage cost is less than CPU/Memory costs. • Separate Read and Write optimised stores. In Hyperscale… @dasiths
  20. Transaction Boundary BEGIN TRANSACTION SELECT @QtyAvailable = Quantity FROM Products

    WHERE ProductId = @ProductId IF @QtyAvailable >= @QtyOrdered UPDATE Products SET Quantity = Quantity - @QtyOrdered WHERE ProductId = @ProductId ELSE THROW 51000, 'The ordered qty is more than available.', 1; COMMIT TRANSACTION Example: Strong Consistency @dasiths (C)onsistent-(P)artition Tolerant System
  21. Example: Strong Consistency Transaction Boundary View orders Order confirmed Order

    requested @dasiths (C)onsistent-(P)artition Tolerant System
  22. Order requested Acknowledged Propagate to other partitions Example: Eventual Consistency

    Out of sync @dasiths (A)vailable-(P)artition Tolerant System
  23. Order requested Acknowledged Propagate to other partitions Example: Eventual Consistency

    Out of sync View orders @dasiths (A)vailable-(P)artition Tolerant System
  24. Order requested Acknowledged Propagate to other partitions Example: Eventual Consistency

    Out of sync View orders @dasiths Order Does Not Exist!!! (A)vailable-(P)artition Tolerant System
  25. Order requested Acknowledged Propagated to other partitions Example: Eventual Consistency

    Order confirmed @dasiths (A)vailable-(P)artition Tolerant System
  26. Order requested Acknowledged Propagated to other partitions Order confirmed Example:

    Eventual Consistency View orders @dasiths (A)vailable-(P)artition Tolerant System
  27. Order requested Acknowledged Propagated to other partitions Order confirmed Example:

    Eventual Consistency View orders @dasiths (A)vailable-(P)artition Tolerant System
  28. Quick Thoughts on Eventual Consistency … Unless you are using

    pessimistic locking, all data is stale, there are possibilities of optimistic concurrency failures. There is some period of time that it takes to build the DTOs, put them on the wire and for the client to receive them and draw them on the screen. Greg Young http://codebetter.com/gregyoung/2010/04/14/quick-thoughts-on-eventual-consistency/ @dasiths
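The optimistic concurrency failure mentioned in the quote above can be illustrated with a version check on write. This is a minimal sketch, not taken from the talk: `VersionedStore`, its in-memory `Map`, and the record id `p1` are hypothetical stand-ins for whatever store a real system uses.

```javascript
// Hypothetical in-memory store that keeps a version number per record.
class VersionedStore {
  constructor() {
    this.records = new Map();
  }

  // Return a copy so callers hold a (possibly stale) snapshot.
  read(id) {
    const record = this.records.get(id);
    return record ? { ...record } : undefined;
  }

  // Write only if the record is still at the version the caller read.
  write(id, value, expectedVersion) {
    const current = this.records.get(id);
    const currentVersion = current ? current.version : 0;
    if (currentVersion !== expectedVersion) {
      throw new Error(
        `Concurrency conflict on ${id}: expected v${expectedVersion}, found v${currentVersion}`
      );
    }
    this.records.set(id, { value, version: currentVersion + 1 });
  }
}

// Two clients read the same snapshot; only the first write succeeds.
const store = new VersionedStore();
store.write('p1', 10, 0);
const clientA = store.read('p1');
const clientB = store.read('p1');
store.write('p1', 8, clientA.version);
try {
  store.write('p1', 7, clientB.version); // stale snapshot
} catch (err) {
  console.log(err.message);
}
```

The second client has to re-read and retry, or surface the conflict to the user; the data it was looking at had gone stale without any network partition being involved.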
  29. Starbucks Does Not Use Two-Phase Commit … In summary we

    can see that the real world is often asynchronous. Our daily lives consists of many coordinated, but asynchronous interactions (reading and replying to e-mail, buying coffee etc) … It also means that often we can look at daily life to help design successful messaging solutions. Gregor Hohpe https://www.enterpriseintegrationpatterns.com/ramblings/18_starbucks.html @dasiths
  30. Race Conditions Don’t Exist … Any time you see requirements

    that indicate a race condition, dig deeper. What you’re likely to find are some additional business concepts as well as the introduction of time and the creation of long-running business processes… Udi Dahan https://udidahan.com/2010/08/31/race-conditions-dont-exist/ @dasiths
  31. • Language Matters. Communicate actual status. ▪ Terminology like Submitted,

    Queued, Pending. • Multiple mediums of communication (Email, SMS etc). ▪ Let the customer know ASAP. • Compensating actions as first class domain concepts. ▪ Treat it as a long running process/workflow. Managing Expectations @dasiths
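One way to read the "Managing Expectations" points is as an explicit status workflow. The sketch below is an assumption-laden illustration: Submitted, Queued and Pending come from the slide, while the `Confirmed` and `Rejected` end states, the `charged` flag, and the `RefundPayment` compensation are hypothetical names added for the example.

```javascript
// Allowed transitions. Submitted, Queued and Pending are from the slide;
// Confirmed and Rejected are assumed end states.
const transitions = {
  Submitted: ['Queued'],
  Queued: ['Pending'],
  Pending: ['Confirmed', 'Rejected'],
  Confirmed: [],
  Rejected: [],
};

// Move an order to the next status, recording history so the UI can
// always communicate the actual state to the customer.
function advance(order, nextStatus) {
  if (!transitions[order.status].includes(nextStatus)) {
    throw new Error(`Cannot move from ${order.status} to ${nextStatus}`);
  }
  const updated = {
    ...order,
    status: nextStatus,
    history: [...order.history, nextStatus],
  };
  // Compensating action as an explicit domain step (hypothetical rule):
  // a rejected order that was already charged gets a refund scheduled.
  if (nextStatus === 'Rejected' && order.charged) {
    updated.compensations = [...order.compensations, 'RefundPayment'];
  }
  return updated;
}

let order = { status: 'Submitted', charged: true, history: ['Submitted'], compensations: [] };
order = advance(order, 'Queued');
order = advance(order, 'Pending');
order = advance(order, 'Rejected');
console.log(order.status, order.compensations);
```

Keeping the compensation in the same model as the status makes the long-running process visible instead of hiding it behind a failed transaction.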
  32. Use local storage as a fake cache Ways To Handle

    Eventual Consistency on The UI Disable and long poll Use a thank you screen WebSockets to show live progress @dasiths
  33. Disable And Long Poll Order requested OrderId: ABC123 /api/order/ABC123 While

    result is 404 Order details @dasiths Redirect to order details
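The "while result is 404" loop on this slide could look roughly like the sketch below. The function name `pollUntilFound`, the interval, and the attempt limit are assumptions rather than anything shown in the deck, and the loop is plain repeated polling rather than a true long poll held open by the server. `fetchFn` is injected so the sketch runs without a network; in a browser the global `fetch` would be passed in.

```javascript
// Poll a resource until it stops returning 404, then return its JSON body.
async function pollUntilFound(url, fetchFn, { intervalMs = 1000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchFn(url);
    if (response.status === 200) {
      return response.json(); // the read side has caught up
    }
    if (response.status !== 404) {
      throw new Error(`Unexpected status ${response.status} for ${url}`);
    }
    // Still 404: the write has not propagated yet, so wait before retrying.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Gave up on ${url} after ${maxAttempts} attempts`);
}
```

In the flow on the slide, the UI would disable the submit button, call something like `pollUntilFound('/api/order/ABC123', fetch)`, and redirect to the order details once the promise resolves. The attempt limit matters: without it a lost write would leave the user on a disabled screen indefinitely.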
  34. Fake Cache Comment posted CommentId: ABC123 This is a stupid

    question!!! var localComments = JSON.parse(localStorage.getItem('comments')) || []; var newComment = { 'questionId': "blah", 'commentId': "ABC123", 'commentText': commentText, 'isFromLocal': true }; localComments.push(newComment); localStorage.setItem('comments', JSON.stringify(localComments)); @dasiths
  35. Fake Cache Comment posted CommentId: ABC123 This is a stupid

    question!!! @dasiths /api/question/blah/comments Doesn’t include Lisa’s comment
  36. Fake Cache Comment posted CommentId: ABC123 This is a stupid

    question!!! questionId = "blah"; httpClient.Get(`api/question/${questionId}/comments`) .Subscribe(comments => { var localComments = JSON.parse(localStorage.getItem('comments')) || []; var questionComments = localComments .filter(q => q.questionId == questionId); // todo: Remove duplicates from local storage return [...comments, ...questionComments]; }) @dasiths
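The "remove duplicates from local storage" step left as a todo on this slide might be handled as in the sketch below. It assumes the server returns the same `commentId` the client stored locally (the slides show `ABC123` for both); `mergeComments` and the injected `storage` parameter are hypothetical names, with `storage` needing only `getItem` and `setItem` so the browser's `localStorage` fits.

```javascript
// Merge server comments with locally cached ones for a single question,
// pruning local copies that the server now returns.
function mergeComments(serverComments, questionId, storage) {
  const localComments = JSON.parse(storage.getItem('comments')) || [];
  const serverIds = new Set(serverComments.map((c) => c.commentId));

  // A local copy is redundant once the server returns the same commentId.
  const stillPending = localComments.filter((c) => !serverIds.has(c.commentId));
  storage.setItem('comments', JSON.stringify(stillPending));

  // Only show pending comments that belong to the question being viewed.
  const pendingForQuestion = stillPending.filter((c) => c.questionId === questionId);
  return [...serverComments, ...pendingForQuestion];
}
```

Called from the response handler as `mergeComments(comments, questionId, localStorage)`, this keeps the author's comment visible until the read model catches up, then lets the server copy take over without showing it twice.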
  37. I see my comment Fake Cache Comment posted CommentId: ABC123

    @dasiths /api/question/blah/comments Doesn’t include Lisa’s comment But, I don’t see it yet!
  38. Fake Cache Comment posted CommentId: ABC123 This is a stupid

    question!!! /api/question/blah/comments @dasiths /api/question/blah/comments Includes Lisa’s comment Doesn’t include Lisa’s comment I see my comment I see it now too!
  39. • Show details on screen if possible. • Communicate via

    multiple mediums. • Have a plan. Act Fast @dasiths
  40. Recap @dasiths

    Disable+Poll: • Easy to implement. • UX isn’t great.
    Thank You: • Easy to implement. • Assumes the system processes the request before user navigates out.
    Fake Cache: • Requires complicated logic. • Good UX if implemented properly. • It’s fake. Use for non-critical use cases.
    Live Progress: • Requires WebSockets/HTTP2 for async server calls. • UI needs to handle server events and display. • Good UX.
  41. Presentation template designed by powerpointify.com Special thanks to all people

    who made and shared these awesome resources for free: CREDITS Photographs by unsplash.com Free Fonts used: https://www.fontsquirrel.com/fonts/oswald @dasiths