Surviving Data in Large Doses

NoSQL Search Roadshow London 2013

Tareq Abedrabbo

November 20, 2013


  1. Surviving Data in Large Doses. Tareq Abedrabbo, NoSQL Search Roadshow London 2013

  2. About me • CTO at OpenCredo • Delivering large-scale data projects in a number of domains • Co-author of Neo4j in Action (Manning)

  3. What this talk is about…

  4. Supermarkets

  5. Meanwhile, in DevLand

  6. Bob is an application developer

  7. Bob wants to build an application. Bob knows that a relational database is definitely not the right choice for his application

  8. Bob chooses a NoSQL database because he likes it (he secretly thinks it’s good for his CV too).

  9. Bob goes for a three-tier architecture. It’s separation of concerns. It’s best practice.

  10. Bob builds an object model first. It’s Domain Driven Design. It’s best practice.

  11. Bob uses an object mapping framework. Databases should be hidden behind layers of abstraction. It’s best practice.

  12. Bob hopes for the best!

  13. What challenges is Bob facing?

  14. Suitability of the data model

  15. Suitability of the architecture and the implementation

  16. Ability to meet new requirements

  17. Being able to use the selected technology to the best of its ability

  18. Performance

  19. A number of applications built on top of NoSQL technologies end up unfit for purpose

  20. How did we get ourselves into such a mess?

  21. • Technical evangelism • Evolution in requirements • Unthinking decisions • Ill-informed opinions

  22. Common problem: there is focus on technology and implementation, not on real value

  23. So what’s the alternative?

  24. Separation of concerns based on data flow

  25. Data flow

  26. • Lifecycle • Structure • Size • Velocity • Purpose

  27. How?

  28. Identify the concerns: what do I care about?

  29. Identify the locality of these concerns: where are the natural

  30. Build focused specialised models

  31. Compose the models into a complete system

  32. Computing is data structures + algorithms

  33. If we accept that separation of concerns should be applied to algorithms, it is appropriate to apply the same thinking to data

  34. The real value of this form of separation of concerns is true decoupling

  35. What’s out there

  36. CQRS

  37. Polyglot Persistence

  38. How do I apply it?

  39. It depends on the data flow :)

  40. For general-purpose data platforms, micro services work well

  41. Build micro services that are closer to the natural underlying

  42. Other strategies are possible, for example if the data is highly volatile, consider in-memory grids

  43. There are practical considerations - obviously

  44. Don’t start with 10 different databases because you think you might eventually need all of them

  45. How would that impact support and operations?

  46. There is potential for simplification based on clearly targeted usage

  47. Links • Twitter: @tareq_abedrabbo • Blog: http://www.terminalstate.net • OpenCredo: http://www.opencredo.com

    Thank you!
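
Slides 30, 31 and 36 describe building focused, specialised models and composing them into a system, with CQRS named as one existing approach. A minimal sketch of that idea might look like the following. The basket domain, the `ItemAdded` event and the class names `WriteModel` and `BasketTotals` are hypothetical and do not come from the talk; an in-memory list and dictionary stand in for whatever stores a real system would use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemAdded:
    """Hypothetical event: a quantity of an item was added to a basket."""
    basket_id: str
    item: str
    quantity: int


class WriteModel:
    """Command side: validates commands and appends events to a log."""

    def __init__(self):
        self.events = []       # append-only log (assumption: in memory)
        self.subscribers = []  # callables notified of each new event

    def add_item(self, basket_id, item, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        event = ItemAdded(basket_id, item, quantity)
        self.events.append(event)
        for notify in self.subscribers:
            notify(event)


class BasketTotals:
    """Query side: a read model shaped to answer one question only."""

    def __init__(self):
        self.totals = defaultdict(int)

    def apply(self, event):
        # Update the projection from an event published by the write side.
        self.totals[event.basket_id] += event.quantity

    def total_items(self, basket_id):
        return self.totals[basket_id]


# Compose the two specialised models into one small system.
write_model = WriteModel()
read_model = BasketTotals()
write_model.subscribers.append(read_model.apply)

write_model.add_item("b1", "milk", 2)
write_model.add_item("b1", "bread", 1)
print(read_model.total_items("b1"))
```

In this sketch the write side only validates and records events, while the read side keeps a structure shaped for a single query. Each side could be backed by a different store, which is one possible route to the polyglot persistence mentioned on slide 37.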