Do you really need that relational DB?

Say you're working on a big e-commerce project. From the early days you knew that a normalised DB was the way to go, and that's what you still do today. One day, the marketing department lands an amazing deal and your website is swarmed by thousands and thousands of customers! That's awesome! Except your DB is sweating so hard that within 2 minutes it draws its last breath and your caches expire. Now what?!

I had this crazy idea, deep in my mind, which no sane person believed in: perhaps we don't really need to query all those relations all the time? Perhaps there is a different way of modelling data for high load? If caching is so hard, are we doing it correctly? I'd like to share this idea with you, with some practical examples from production and some theoretical guesses. My goal is to seed new ideas in your minds and feed them with a different approach which you might implement in the future. I'll be talking about data projections, events, queues, data modelling, and de-normalisation. I might mention a buzzword or two like NoSQL, AWS or DDD.

Transcript

  1. Do you really need that relational DB?

  2. Do you begin coding from DB structure?

  3. Do you begin coding from user's perspective?

  4. Why do we constantly select data that doesn't change? https://commons.wikimedia.org/wiki/File:Man-scratching-head.gif

  5. Reading same content from same source for each view. Why?

  6. It's all in the roots

  7. Context: Medium - High traffic

  8. Perceived speed is important • Pinterest increased search engine traffic and sign-ups by 15% when they reduced perceived wait times by 40%. • COOK increased conversions by 7%, decreased bounce rates by 7%, and increased pages per session by 10% when they reduced average page load time by 850ms. • Source: https://developers.google.com/web/fundamentals/performance/why-performance-matters/

  9. RDBMS is slow - bbc.com

  10. RDBMS is slow - coolblue.nl

  11. How to stop relying on RDBMS in User facing apps?

  12. The setup

  13. Typical news portal

  14. Multiple slices of same data

  24. But it can be cached? • Difficult to determine how long • Big news - short cache • Regular news - long cache • Too small TTL - more hardware needed • Too big TTL - the news is outdated or includes mistakes

  25. No control over cached data https://commons.wikimedia.org/wiki/File:Emblem-evil-computer.svg

  26. Multiple users generating same cache because it expired
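
A rough sketch of the problem on the slide above, assuming a single cached front page whose entry has just expired. All names here (`cache`, `expensive_query`, `simulate`) are hypothetical; the barrier only exists to make the "everyone arrives at the same moment" timing deterministic.

```python
import threading

cache = {}                 # shared cache; empty means the entry has just expired
db_queries = 0             # how many times the expensive query ran
counter_lock = threading.Lock()

def expensive_query():
    """Stand-in for the slow relational query that builds the page."""
    global db_queries
    with counter_lock:
        db_queries += 1
    return "<front page>"

def handle_request(barrier):
    page = cache.get("front_page")   # every request sees a miss ...
    barrier.wait()                   # ... because nobody has finished regenerating yet
    if page is None:
        page = expensive_query()     # so every request hits the database
        cache["front_page"] = page
    return page

def simulate(concurrent_users):
    """Return how many database queries one expired entry causes."""
    global db_queries
    db_queries = 0
    cache.clear()
    barrier = threading.Barrier(concurrent_users)
    threads = [
        threading.Thread(target=handle_request, args=(barrier,))
        for _ in range(concurrent_users)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return db_queries
```

In this sketch `simulate(50)` runs the expensive query once per user, which is exactly the load spike the cache was supposed to prevent.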

  27. Caching didn't save my project

  28. Static HTML file saved it

  29. So what else?

  32. Read Model

  36. Read model? • Read model in CQRS • Aggregate root in DDD • A view in RDBMS

  37. A dataset crafted specifically for a single use case on the website

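
To illustrate the definition above, here is a minimal sketch of a read model for one use case: the list of teasers on a news front page. The field names and the shape of the source data are assumptions made for the example, not taken from the talk.

```python
from dataclasses import dataclass, asdict
import json

@dataclass(frozen=True)
class FrontPageTeaser:
    """Everything the front page needs for one article, and nothing else."""
    article_id: int
    title: str
    author_name: str     # copied in at write time, not joined at read time
    category_name: str   # de-normalised on purpose
    published_at: str    # ISO 8601 string, ready to render

def build_front_page(articles, authors, categories, limit=10):
    """Build the read model from normalised data (done on write, not per view)."""
    newest = sorted(articles, key=lambda a: a["published_at"], reverse=True)[:limit]
    return [
        FrontPageTeaser(
            article_id=a["id"],
            title=a["title"],
            author_name=authors[a["author_id"]],
            category_name=categories[a["category_id"]],
            published_at=a["published_at"],
        )
        for a in newest
    ]

def serialise(teasers):
    """One JSON document; serving the page becomes a single key lookup."""
    return json.dumps([asdict(t) for t in teasers])

# Hypothetical source data, as it might come out of the relational tables.
authors = {1: "Ada"}
categories = {7: "Tech"}
articles = [
    {"id": 10, "title": "Old story", "author_id": 1, "category_id": 7,
     "published_at": "2019-01-01T09:00:00"},
    {"id": 11, "title": "New story", "author_id": 1, "category_id": 7,
     "published_at": "2019-01-02T09:00:00"},
]
front_page = build_front_page(articles, authors, categories)
```

The joins still happen, but once per change instead of once per page view.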
  45. You don't need RDBMS for a news portal

  47. Create seams

  50. Projector? • Comes from Event Sourcing • Responsible for projecting an event stream to any structural representation

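
A minimal sketch of what such a projector could look like, assuming events arrive as plain dicts with a `type` field. The event names, the `FrontPageProjector` class, and the dict used as a store are all hypothetical stand-ins.

```python
class FrontPageProjector:
    """Keeps a 'latest headlines' read model up to date from a stream of events."""

    def __init__(self, store, limit=5):
        self.store = store   # any key-value store; a dict stands in here
        self.limit = limit
        self.store.setdefault("front_page", [])

    def apply(self, event):
        handler = getattr(self, "_on_" + event["type"], None)
        if handler is not None:   # events this read model does not care about are ignored
            handler(event)

    def _on_article_published(self, event):
        headlines = [h for h in self.store["front_page"] if h["id"] != event["id"]]
        headlines.insert(0, {"id": event["id"], "title": event["title"]})
        self.store["front_page"] = headlines[: self.limit]

    def _on_article_retitled(self, event):
        for headline in self.store["front_page"]:
            if headline["id"] == event["id"]:
                headline["title"] = event["title"]

    def _on_article_unpublished(self, event):
        self.store["front_page"] = [
            h for h in self.store["front_page"] if h["id"] != event["id"]
        ]

def replay(events, projector):
    """Rebuild or update the read model by applying events in order."""
    for event in events:
        projector.apply(event)

store = {}
events = [
    {"type": "article_published", "id": 1, "title": "Big deal announced"},
    {"type": "article_published", "id": 2, "title": "Weather update"},
    {"type": "article_retitled", "id": 1, "title": "Big deal announced (updated)"},
    {"type": "comment_added", "id": 1},          # irrelevant to this read model
    {"type": "article_unpublished", "id": 2},
]
replay(events, FrontPageProjector(store))
```

After the replay, `store["front_page"]` holds exactly what the page needs, so the request path never touches the source tables.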
  51. The Projector

  53. Projected Read Model

  54. No need for Event Sourcing

  59. Unlimited possibilities • Convert straight to HTML • Add a queue for big amounts of data • Hook up external services for integrations • Project as many different formats as you want

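
As one possible reading of "convert straight to HTML", the sketch below projects an article into a static file that a web server could serve directly. The article shape (`slug`, `title`, `body`) and the function names are assumptions for illustration.

```python
import html
import os
import tempfile
from pathlib import Path

def render_article(article):
    """Turn one article (a plain dict here) into a complete HTML page."""
    title = html.escape(article["title"])
    body = html.escape(article["body"])
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{title}</title></head>\n'
        f"<body><h1>{title}</h1><p>{body}</p></body></html>\n"
    )

def project_to_file(article, out_dir):
    """Write the page to a temporary file, then swap it in with one rename."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{article['slug']}.html"
    # mkstemp creates the file with owner-only permissions; adjust with
    # os.chmod if a separate web server process has to read it.
    fd, tmp_name = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(render_article(article))
    os.replace(tmp_name, target)   # readers see either the old page or the new one
    return target
```

Calling `project_to_file` whenever an article changes keeps the file current, so there is no TTL to tune.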
  63. But... • Costly mistakes - a need to regenerate everything • A lot more moving parts, more complexity • Sounds a bit "backwards" and unusual

  72. How about a harder example?

  78. Optimisation for back office • Analysts will want every possible slice of your data • Very different queries from what you use on production • Difficult to find correct indexes to satisfy both worlds

  85. Cache • Product? • Lists? • Both? • Parts of the product? • Price? • Users regenerating it on random TTL?

  86. You don't need RDBMS to display a product page

  87. Multiple Read Models for similar data • Storage is cheap!

  88. Be careful of inconsistencies

  91. Dealing with lots of data

  97. NoSQL streams to the rescue • Store final result into NoSQL • Streams trigger the projections • AWS DynamoDB • MongoDB Change Streams • Something else?

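
A hedged sketch of "streams trigger the projections", written as an AWS Lambda handler for DynamoDB Streams records. It assumes a table keyed by a string attribute `id`, items with `title` and `price` attributes, and a stream configured to include new item images. `READ_MODELS` is a hypothetical in-memory stand-in for the real read-model store.

```python
from decimal import Decimal

READ_MODELS = {}   # hypothetical stand-in for wherever the read models live

def plain(attr):
    """Convert one DynamoDB attribute value ({"S": "x"}, {"N": "1"}, ...) to Python.

    Only the types this sketch needs are handled.
    """
    (kind, value), = attr.items()
    if kind == "S":
        return value
    if kind == "N":
        return Decimal(value)
    if kind == "BOOL":
        return value
    if kind == "M":
        return {k: plain(v) for k, v in value.items()}
    if kind == "L":
        return [plain(v) for v in value]
    raise ValueError(f"unsupported attribute type: {kind}")

def handler(event, context=None):
    """Project each changed item into the product-page read model."""
    for record in event["Records"]:
        key = plain(record["dynamodb"]["Keys"]["id"])
        if record["eventName"] == "REMOVE":
            READ_MODELS.pop(key, None)
        else:   # INSERT or MODIFY
            item = {k: plain(v) for k, v in record["dynamodb"]["NewImage"].items()}
            READ_MODELS[key] = {"title": item["title"], "price": str(item["price"])}
    return len(event["Records"])

# A hand-written event in the shape Lambda passes for stream records.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"id": {"S": "sku-1"}},
                "NewImage": {
                    "id": {"S": "sku-1"},
                    "title": {"S": "Kettle"},
                    "price": {"N": "19.99"},
                },
            },
        }
    ]
}
handler(sample_event)
```

The same idea should carry over to MongoDB Change Streams: the change notification, not a page view, is what runs the projector.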
  98. Queues to the rescue • AWS SQS • RabbitMQ • Kafka

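
The role of the queue can be sketched with Python's in-process `queue.Queue` standing in for SQS, RabbitMQ or Kafka (an assumption made only to keep the example self-contained). The producer enqueues change events and returns immediately; a worker drains the queue and runs the projection at its own pace.

```python
import queue
import threading

def worker(jobs, project):
    """Take events off the queue one by one and hand them to the projector."""
    while True:
        event = jobs.get()
        try:
            if event is None:      # sentinel value: no more work
                return
            project(event)
        finally:
            jobs.task_done()

jobs = queue.Queue()
projected = []                     # stand-in for the read-model store

consumer = threading.Thread(target=worker, args=(jobs, projected.append))
consumer.start()

for article_id in range(3):        # the write side only enqueues and moves on
    jobs.put({"type": "article_changed", "id": article_id})

jobs.put(None)                     # ask the worker to stop
jobs.join()                        # wait until everything has been processed
consumer.join()
```

With a real broker the worker would typically run in a separate process, and a burst of changes becomes a longer queue rather than a spike on the database.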
  104. No injection points? • RDBMS triggers? • Simple microservice that just polls DB • Slave replication • Webhook • Anything that can trigger your projector

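
For the "simple microservice that just polls DB" option, a minimal sketch using SQLite as the stand-in database. It assumes the source table has an `updated_at` column that increases on every change; the table, the column names, and `poll_once` are all hypothetical.

```python
import sqlite3

def poll_once(conn, last_seen, project):
    """Project every row changed since `last_seen`; return the new high-water mark."""
    rows = conn.execute(
        "SELECT id, title, updated_at FROM articles "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    for article_id, title, updated_at in rows:
        project({"id": article_id, "title": title})
        last_seen = updated_at
    return last_seen

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(1, "First", 100), (2, "Second", 105)],
)

projected = []
watermark = poll_once(conn, 0, projected.append)          # picks up both rows

conn.execute(
    "UPDATE articles SET title = 'First (corrected)', updated_at = 110 WHERE id = 1"
)
watermark = poll_once(conn, watermark, projected.append)  # picks up only the change
```

A real service would call `poll_once` in a loop with a short sleep and persist the watermark. This sketch does not notice deleted rows, and rows sharing the same `updated_at` as the watermark could be missed, so it is a starting point rather than a complete change feed.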
  111. Always start from the use case

  113. Read Model + Projector = No Cache • No Cache = Easier & Longer life

  115. You don't need a relational database* (* for user-facing content)

  116. Questions? Did you like the talk? Was it crap? I wanna know, please share your feedback! https://joind.in/talk/d687d