Databases — the Choice Is Yours

Databases — the Choice Is Yours

In hardly any other area we’ve had as much change and improvement as in the field of databases. Just a few years ago everyone had “their” database that was used for each project. Today you are confronted with a variety of approaches and implementations.

We start off with a brief look at the theoretical background of distributed systems and databases in particular. After that, we discuss possible scenarios as well as the advantages and disadvantages of several databases:

* Why SQL is in fashion (again).
* Why MongoDB’s document structure fits object-oriented programming so well.
* How you can capture visitor hits with Redis efficiently.
* Why Cassandra is so scalable and fail-safe.
* How full-text search works with Elasticsearch.

The right choice of database(s) hasn’t become easier through the wide range of possibilities, but all the more interesting!

Ce4685da897c912aa41a815435b40a5a?s=128

Philipp Krenn

May 28, 2017
Tweet

Transcript

  1. Databases the Choice Is Yours Philipp Krenn̴̴̴̴̴̴̴@xeraa

  2. None
  3. None
  4. None
  5. Infrastructure | Developer Advocate

  6. ViennaDB Papers We Love Vienna

  7. None
  8. None
  9. "relational model and normal form" ↓ "data independence"

  10. Relational Algebra Book "Seven Databases in Seven Weeks"

  11. None
  12. Declarative Implementation detail of the RDBMS (Query Optimizer)

  13. User Interaction Aggregated reports Concurrency, integrity, consistency, type-safety

  14. Codd's 12 rules

  15. None
  16. From a Wide Variety of Databases to A Dark Age

  17. None
  18. http://blog.8thlight.com/uncle-bob/ 2012/05/15/NODB.html

  19. Slow & inflexible

  20. None
  21. None
  22. None
  23. https://aphyr.com/posts/288-the- network-is-reliable

  24. Beating the CAP Theorem Checklist Your ( ) tweet (

    ) blog post ( ) marketing material ( ) online comment advocates a way to beat the CAP theorem. Your idea will not work. http://ferd.ca/beating-the-cap-theorem-checklist.html
  25. Schema Flexibility http://pragprog.com/magazines/ 2012-05/beyond-the-bit-bucket

  26. None
  27. None
  28. None
  29. None
  30. NoSQL Not Only SQL

  31. None
  32. None
  33. Big Data Very broad, including NoSQL Offline data / data

    warehouse, Hadoop & Spark
  34. None
  35. http://db-engines.com/en/ranking

  36. None
  37. CouchDB, Couchbase, Elasticsearch,...

  38. Ted Neward: ORM is "The Vietnam of Computer Science" http://blogs.tedneward.com/post/the-vietnam-of-computer-

    science/
  39. /** * @ManyToMany(targetEntity="Phonenumber") * @JoinTable(name="users_phonenumbers", * joinColumns={@JoinColumn(name="user_id", referencedColumnName="id")}, * inverseJoinColumns={@JoinColumn(name="phonenumber_id",

    * referencedColumnName="id", unique=true) * } * ) */ public $phonenumbers;
  40. None
  41. abstract class Person { protected $name; } class Employee extends

    Person { protected $company; } class Manager extends Employee { protected $approveFunds; } class Worker extends Employee { protected $yearsExperience; }
  42. RDBMS 1. Union table with (many) NULL values

  43. RDBMS 2. Concrete instances without common queries

  44. RDBMS 3. Base table JOINed with concrete instances

  45. /** * @MappedSuperclass */ abstract class Person { /** @Id

    */ protected $id; /** @String */ protected $name; }
  46. /** * @Document(collection="person") * @InheritanceType("SINGLE_COLLECTION") * @DiscriminatorField(fieldName="type") * @DiscriminatorMap({ *

    "person"="Person", * "employee"="Employee", * "worker"="Worker" * }) */ class Employee extends Person { /** @String */ protected $company; }
  47. /** @Document */ class Manager extends Employee { /** @Boolean

    */ protected $approveFunds; } /** @Document */ class Worker extends Employee { /** @Integer */ protected $yearsExperience; }
  48. { "_id": ObjectId("5461c8bf9e2acf32ed50c079"), "name": "Peter", "company": "MongoDB", "approveFunds": true, "type":

    "manager" } { "_id": ObjectId("524d9fe7e4b0f8bd3031f84e"), "name": "Philipp", "company": "MongoDB", "yearsExperience": 10, "type": "worker" }
  49. None
  50. None
  51. Key-Value Store and more REmote DIctionary Server

  52. Statistics (Login) Bitset HyperLogLog

  53. Bitset First user bit 1, second user bit 2,... A

    million users ~123KB
  54. HyperLogLog Unique elements in 12KB regardless of their number Error

    rate ~0,81% http://redis.io/commands/pfcount
  55. Comparison HyperLogLog: constant size Bitset: exact & aggregates

  56. Overall Simple Omni-present "glue"

  57. None
  58. Scalable & highly available Consistent Hashing

  59. None
  60. None
  61. None
  62. None
  63. None
  64. * Without read and write QUORUM

  65. Eventually consistent was too hard, so we went with immediately

    inaccurate. — https://twitter.com/mysqlborat/status/621785755107524608
  66. None
  67. None
  68. SQL Databases not only RDBMS

  69. Cassandra Query Language (CQL) Google Query Language (GQL) Couchbase N1QL

    RethinkDB Query Language (ReQL)
  70. RDBMS features & maturity + NoSQL influence

  71. None
  72. Full-Text Search

  73. None
  74. Example These are <em>not</em> the droids you are looking for.

  75. html_strip Char Filter These are not the droids you are

    looking for.
  76. standard Tokenizer These̴are̴not̴the̴droids̴you̴are̴looking̴for

  77. lowercase Token Filter these̴are̴not̴the̴droids̴you̴are̴looking̴for

  78. stop Token Filter droids̴you̴looking

  79. snowball Token Filter droid̴you̴look

  80. Document Score Term Frequency (TF) / Inverse Document Frequency (IDF)

    Field-length norm https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html
  81. Conclusion

  82. None
  83. But is it fast?

  84. None
  85. None
  86. Thanks! Questions? Stickers @xeraa̴̴https://joind.in/talk/ d897d

  87. Credits Schnitzel https://flic.kr/p/9m27wm Architecture https://flic.kr/p/6dwCAe Conchita https://flic.kr/p/nBqSHT Consistent hashing https://

    highlyscalable.wordpress.com/2012/09/18/ distributed-algorithms-in-nosql-databases/ No, SQL https://twitter.com/edd/status/ 400190499585544192