Comparison of Database Types

Mark Taylor | @willCodeForAle

A brief history of databases
• 1725 – first punch card system to control a loom
• Flat file – fine for small amounts of data, e.g. XML, CSV, JSON
• 1960s – Hierarchical – tree structure, e.g. filesystem, DNS
• 1970 – Relational – structured, table-based data
• 2000 – NoSQL
  – 1970+ – Key value (rose in popularity 2000+)
  – 2009 – Document
  – 2000s – Graph
  – 2000s – Column family
  – 2010s – Time series

Relational databases

Relational Databases - Pros
• Tried and tested, reliable
• Strong consistency – ACID (Atomicity, Consistency, Isolation, Durability)
• SQL is standardised and easy to get started with
• Developers tend to be already familiar
• Excellent range of tools and libraries available
• Excellent driver support for all languages
• Costs and risks generally understood
• Basically, they’re a safe bet!
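To make the "A" in ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names and the transfer helper are all assumptions made up for this illustration; the point is only that a failed transaction leaves no partial changes behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically. Returns True if committed."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back too, so no money is created.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical schema for the example: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

ok = transfer(conn, "alice", "bob", 30)        # succeeds and commits
rejected = transfer(conn, "alice", "bob", 500)  # fails and rolls back
```

After both calls the balances reflect only the first transfer; the second one changed nothing, even though its first UPDATE had already run.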

Relational Databases - Cons
• Speed – the overhead of the relational model limits query speed. Queries with many joins or very large tables perform worse.
• Scaling – largely limited to the resources of a single server; partitioning is hard to do while keeping the data model consistent. Managing large amounts of data is difficult.
• Cost – CPU is expensive, storage is cheap. Costs rise steeply once a single server's limits are reached.
• Requires a predefined schema, which is cumbersome to update.

Featured Relational DB - MySQL
• Ranked 2nd, below Oracle, on DB-Engines
• Affordable!
• Great community
• Excellent library and driver support
• Open source

NoSQL Databases
• High-performance, non-relational databases
• Flexible – data is generally stored in a flexible structure
• Scalable – partitioned horizontal scaling
• Resilient – high availability is a core design factor
• Fast – due to the way data is stored and queried
• Around since the 2000s

Types of NoSQL Database
• Document – data is stored hierarchically in JSON documents
• Key value – data is stored in key-value pairs
• Graph – data is stored as a graph with nodes, edges and properties
• Wide Column – related data is stored as a set of nested key-value pairs in a single column

CAP Theorem
• Distributed data systems always offer a trade-off between consistency, availability and partition tolerance.
• Consistency – each node in the cluster responds with the most up-to-date data
• Availability – each node returns an immediate response, even if it’s not the most recent data
• Partition Tolerance – the system continues to operate even if nodes in the cluster fail or cannot communicate with each other
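The trade-off above can be sketched with a toy two-node cluster. This is not how any real database implements replication; the Cluster class, the "CP" and "AP" mode labels and the node names are all invented for illustration. It only shows what each choice looks like to a client while the nodes cannot talk to each other.

```python
class Cluster:
    """Toy two-node cluster. `mode` is "CP" or "AP" (labels for this sketch)."""

    def __init__(self, mode):
        self.mode = mode
        self.nodes = {"a": None, "b": None}  # node name -> stored value
        self.partitioned = False             # True when the nodes cannot talk

    def write(self, node, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse writes that cannot be replicated.
                raise RuntimeError("unavailable during partition")
            # Availability first: accept the write locally; replicas diverge.
            self.nodes[node] = value
        else:
            # Healthy network: replicate the write to every node.
            for name in self.nodes:
                self.nodes[name] = value

    def read(self, node):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable during partition")
        return self.nodes[node]

ap = Cluster("AP")
ap.write("a", "v1")
ap.partitioned = True
ap.write("a", "v2")
stale = ap.read("b")  # node b still answers, but with the older value
```

In "AP" mode node b keeps answering with stale data; in "CP" mode the same read or write would raise instead of returning something possibly out of date.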

NoSQL – Key Value Stores

NoSQL – Key Value Databases
• Blazingly fast storage and retrieval
• Extremely scalable
• Each entry is made up of two linked data items: a key and a value
• Stored data is opaque to the database, so there is no structured querying
• Typically used for caching, session stores, shopping carts
• No defined schema
• Basic querying
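As a rough picture of the session-store use case, here is a minimal in-memory sketch of a key-value store with expiry. The KeyValueStore class, its method names and the injectable clock are assumptions for this example, not the API of any particular product.

```python
import time

class KeyValueStore:
    """In-memory key-value store with optional per-key time to live."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expiry time or None)
        self._clock = clock  # injectable so expiry can be tested

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return default
        return value

# A fake clock makes the expiry behaviour easy to follow.
now = [0.0]
store = KeyValueStore(clock=lambda: now[0])

# The value is an opaque blob: the store never looks inside it.
store.set("session:42", b'{"user": "mark"}', ttl=30)
fresh = store.get("session:42")

now[0] = 31.0  # 31 seconds later the session has expired
expired = store.get("session:42")
```

The only way to find data is by its exact key, which is what keeps lookups fast and the querying basic.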

Featured Key Value Store - Redis
• Open source (BSD licence)
• Has many useful data structures like hashes, lists, sets
• Supports master-slave data replication with failover
• Supports transactions via a command queue
• Can persist data on disk
• Has a high availability offering via Redis Sentinel and automatic partitioning with Redis Cluster
• Great client library support

NoSQL - Document Databases
• Store documents using JSON, XML, YAML or BSON (binary JSON)
• Very flexible structure – optional fields
• Can use indexes for faster performance
• Sub-class of key-value store
• Documents can be queried, unlike key/value
• Allows partial updates of documents
• Some implementations offer basic joins
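The difference from a plain key-value store is that the database can look inside the value. A minimal sketch with plain Python dictionaries standing in for JSON documents; the helper names (get_path, find, update) and the dotted-path convention are assumptions for this example.

```python
def get_path(doc, path):
    """Follow a dotted path such as "address.city"; None if any part is missing."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

def find(collection, path, value):
    """Query by field: return documents whose field at `path` equals `value`."""
    return [doc for doc in collection if get_path(doc, path) == value]

def update(doc, changes):
    """Partial update: set only the dotted paths listed in `changes`."""
    for path, value in changes.items():
        *parents, leaf = path.split(".")
        target = doc
        for part in parents:
            target = target.setdefault(part, {})  # create missing sub-documents
        target[leaf] = value

users = [
    {"_id": 1, "name": "Ada", "address": {"city": "Leeds"}},
    {"_id": 2, "name": "Bob"},  # optional fields can simply be left out
]

in_leeds = find(users, "address.city", "Leeds")
update(users[1], {"address.city": "York"})  # the rest of the document is untouched
```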

Featured Document DB - DynamoDB
• Fully managed SaaS
• Awesome for serverless applications
• High performance – powers Amazon, Netflix, Lyft, Medium
• Single-digit millisecond latency
• Multi-purpose – key value and document data models
• ACID transactions
• Flexible modelling
• Flexible billing – on demand vs provisioned
• Real-time processing with DynamoDB Streams

Learning DynamoDB
• Check out Alex DeBrie! – https://twitter.com/alexbdebrie
• https://www.dynamodbguide.com/
• https://www.dynamodbbook.com/
• AWS re:Invent – DynamoDB Deep Dive – https://www.youtube.com/watch?v=HaEPXoXVf2k

NoSQL - Graph Databases

Graph Databases
• Kind of like document databases, but with relationships!
• Nodes represent entities, edges represent relationships
• Models are simpler and more expressive than relational
• Flexible properties
• Very flexible query languages, e.g. Cypher, Gremlin
• Relationships should be first-class citizens
• Relational joins are very expensive! Graph traversal is fast because relationships are stored explicitly.
• Useful for highly connected data, e.g. Facebook friends.
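To show why traversal is cheap once relationships are stored with each node, here is a small sketch of a friends-of-friends lookup. The adjacency dictionary and the within_hops helper are assumptions for this example; a real graph database would express the same idea in a query language such as Cypher or Gremlin.

```python
from collections import deque

# Each node stores its own edges, so following a relationship is a direct lookup.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}

def within_hops(graph, start, max_hops):
    """Breadth-first traversal: map each reachable node to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]  # the start node is not its own friend
    return distance

reachable = within_hops(friends, "alice", 2)
friends_of_friends = {name for name, hops in reachable.items() if hops == 2}
```

The work done depends on how many edges are followed, not on the total number of people stored.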

Featured Graph Database - Neo4j
• Awesome Cypher query language, very easy to get started
• Great community and learning resources – easy to learn
• Offers an open source community version
• Offers Enterprise version with clustering and HA
• ACID compliance
• Strong driver support
• Useful query browser

Column Family Databases
• A column “family” is like a table in a relational database
• Very high performance and highly scalable
• Efficient at data compression and partitioning
• Often used for Big Data and IoT due to fast insert and query speeds
• Used by Spotify to store user profile attributes, artists, songs

Column Family Databases
• Each row typically contains a row key as the first column, which uniquely identifies that row. The following columns each contain a column key, which uniquely identifies that column within the row.
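The row key / column key layout described above can be pictured as a dictionary of dictionaries. This is only a sketch of the data model, with a hypothetical ColumnFamily class and made-up keys; it ignores everything a real store does for partitioning and compression.

```python
from collections import defaultdict

class ColumnFamily:
    """Rows addressed by row key; columns addressed by column key within a row."""

    def __init__(self):
        self._rows = defaultdict(dict)  # row key -> {column key: value}

    def put(self, row_key, column_key, value):
        self._rows[row_key][column_key] = value

    def get(self, row_key, column_key, default=None):
        return self._rows.get(row_key, {}).get(column_key, default)

    def row(self, row_key):
        """Return one row's columns, ordered by column key."""
        return dict(sorted(self._rows.get(row_key, {}).items()))

users = ColumnFamily()
users.put("user:1", "name", "Ada")
users.put("user:1", "email", "ada@example.com")
users.put("user:2", "name", "Bob")  # rows need not share the same columns

ada = users.row("user:1")
bob_email = users.get("user:2", "email")  # missing column, so the default is returned
```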

NoSQL – Time Series Databases
• Optimised for time-stamped or time series data through associated pairs of time(s) and value(s)
• Useful for high-velocity logging and metrics, e.g. sensors, monitoring, clicks, stock trading
• Optimised for measuring change over time and querying through aggregations
• You’ve probably used one if you’ve used a product like New Relic or Grafana with Prometheus
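A typical aggregation query over time series data is downsampling: averaging raw readings into fixed time buckets. The sketch below assumes timestamps in whole seconds and uses an invented downsample helper and sample readings; a real time series database would do this inside its query engine.

```python
from collections import defaultdict

def downsample(points, bucket_seconds):
    """Average (timestamp, value) pairs into fixed-width time buckets."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Round the timestamp down to the start of its bucket.
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }

# Hypothetical sensor readings: (seconds since start, temperature).
readings = [(0, 20.0), (15, 22.0), (59, 21.0), (60, 30.0), (119, 32.0)]

per_minute = downsample(readings, 60)
```

Here the first three readings fall into the bucket starting at 0 seconds and the last two into the bucket starting at 60 seconds.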

Mark Taylor | @willCodeForAle
