Staying in Sync: From Transactions to Streams

Slides from a talk given at QCon London on 7 March 2016.
http://martin.kleppmann.com/2016/03/07/qcon-london.html

Abstract:

For the very simplest applications, a single database is sufficient, and then life is pretty good. But as your application needs to do more, you often find that no single technology can do everything you need to do with your data. And so you end up having to combine several databases, caches, search indexes, message queues, analytics tools, machine learning systems, and so on, into a heterogeneous infrastructure…

Now you have a new problem: your data is stored in several different places, and if it changes in one place, you have to keep it in sync in the other places, too. It’s not too bad if all your systems are up and running smoothly, but what if some parts of your systems have failed, some are running slow, and some are running buggy code that was deployed by accident?

It’s not easy to keep data in sync across different systems in the face of failure. Distributed transactions and 2-phase commit have long been seen as the “correct” solution, but they are slow and have operational problems, and so many systems can’t afford to use them.

In this talk we’ll explore using event streams and Kafka for keeping data in sync across heterogeneous systems, and compare this approach to distributed transactions: what consistency guarantees can it offer, and how does it fare in the face of failure?
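To make the event-stream idea concrete, here is a minimal sketch, assuming a single totally ordered log that every downstream system reads at its own pace. It is an in-memory stand-in, not Kafka's API: the names EventLog, Consumer, SearchIndex and apply_to_cache are hypothetical and chosen only for illustration. The point it shows is that a consumer which was down can catch up from its last offset and still converge to the same state as the others.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# An event is a (key, value) pair; a value of None means "delete this key".
Event = Tuple[str, Optional[str]]


class EventLog:
    """Hypothetical in-memory stand-in for one totally ordered log partition."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, key: str, value: Optional[str]) -> int:
        """Append an event and return its offset in the log."""
        self._events.append((key, value))
        return len(self._events) - 1

    def read_from(self, offset: int) -> List[Event]:
        """Return all events at or after the given offset, in log order."""
        return self._events[offset:]


class Consumer:
    """Applies log events, in order, to one downstream system."""

    def __init__(self, log: EventLog, apply: Callable[[str, Optional[str]], None]) -> None:
        self.log = log
        self.apply = apply
        self.offset = 0  # position of the next event this consumer will read

    def poll(self) -> int:
        """Apply every event not yet seen; return how many were applied."""
        events = self.log.read_from(self.offset)
        for key, value in events:
            self.apply(key, value)
            self.offset += 1
        return len(events)


def apply_to_cache(cache: Dict[str, str], key: str, value: Optional[str]) -> None:
    """Downstream system 1: a key-value cache."""
    if value is None:
        cache.pop(key, None)
    else:
        cache[key] = value


class SearchIndex:
    """Downstream system 2: a tiny inverted index from word to keys."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}
        self.postings: Dict[str, Set[str]] = {}

    def apply(self, key: str, value: Optional[str]) -> None:
        # Remove the postings of the previous version before indexing the new one.
        old = self.docs.pop(key, None)
        if old is not None:
            for word in old.lower().split():
                self.postings[word].discard(key)
        if value is not None:
            self.docs[key] = value
            for word in value.lower().split():
                self.postings.setdefault(word, set()).add(key)

    def search(self, word: str) -> Set[str]:
        return set(self.postings.get(word.lower(), set()))


log = EventLog()
cache: Dict[str, str] = {}
index = SearchIndex()
cache_consumer = Consumer(log, lambda k, v: apply_to_cache(cache, k, v))
index_consumer = Consumer(log, index.apply)

log.append("user:1", "Alice in London")
log.append("user:2", "Bob in Paris")
cache_consumer.poll()  # the index consumer is "down" and misses this round
log.append("user:1", "Alice in Berlin")
cache_consumer.poll()
index_consumer.poll()  # catches up from offset 0 and converges
```

Because the application writes only to the log, there is no moment at which it must update the cache and the index together; each consumer tracks its own offset and replays whatever it missed.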

References:

1. Mahesh Balakrishnan, Dahlia Malkhi, Ted Wobber, et al.: “Tango: Distributed Data Structures over a Shared Log,” at 24th ACM Symposium on Operating Systems Principles (SOSP), pages 325–340, November 2013. http://research.microsoft.com/pubs/199947/Tango.pdf

2. Molly Bartlett Dishman and Martin Fowler: “Agile Architecture,” at O'Reilly Software Architecture Conference, March 2015. http://conferences.oreilly.com/software-architecture/sa2015/public/schedule/detail/40388

3. Shirshanka Das, Chavdar Botev, Kapil Surlaker, et al.: “All Aboard the Databus!,” at ACM Symposium on Cloud Computing (SoCC), October 2012. http://www.socc2012.org/s18-das.pdf

4. Pat Helland: “Life beyond Distributed Transactions: an Apostate’s Opinion,” at 3rd Biennial Conference on Innovative Data Systems Research (CIDR), pages 132–141, January 2007. http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf

5. Pat Helland: “Immutability Changes Everything,” at 7th Biennial Conference on Innovative Data Systems Research (CIDR), January 2015. http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf

6. Martin Kleppmann: “Designing Data-Intensive Applications.” O’Reilly Media, to appear. http://dataintensive.net/

7. Jay Kreps: “I ♥︎ Logs.” O'Reilly Media, September 2014. http://shop.oreilly.com/product/0636920034339.do

8. Jay Kreps: “Putting Apache Kafka to use: A practical guide to building a stream data platform.” 25 February 2015. http://blog.confluent.io/2015/02/25/stream-data-platform-1/

9. Leslie Lamport: “Time, Clocks, and the Ordering of Events in a Distributed System,” Communications of the ACM, volume 21, number 7, pages 558–565, July 1978. http://research.microsoft.com/en-US/um/people/Lamport/pubs/time-clocks.pdf

10. Neha Narkhede: “Announcing Kafka Connect: Building large-scale low-latency data pipelines.” 18 February 2016. http://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latency-data-pipelines

11. Fred B Schneider: “Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial,” ACM Computing Surveys, volume 22, number 4, pages 299–319, December 1990. http://www.cs.cornell.edu/fbs/publications/smsurvey.pdf

12. Yogeshwer Sharma, Philippe Ajoux, Petchean Ang, et al.: “Wormhole: Reliable Pub-Sub to Support Geo-replicated Internet Services,” at 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI), May 2015. https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-sharma.pdf

13. Martin Thompson: “Single Writer Principle.” 22 September 2011. http://mechanical-sympathy.blogspot.co.uk/2011/09/single-writer-principle.html

14. Vaughn Vernon: “Implementing Domain-Driven Design.” Addison-Wesley Professional, February 2013.

Martin Kleppmann

March 07, 2016
