Relational is the new Big Data by Miguel Ángel Fajardo and Daniel Domínguez at Big Data Spain 2017

RELATIONAL is the new BIG DATA

Daniel Domínguez, Head of Data. Previously: CIEMAT / CERN. @danieluchi01
Miguel Ángel Fajardo, CTO. Previously: EA Games, Gilt, Shutterstock. @ma_bits

Distributed Processing Not Only SQL

A long time ago in a galaxy far, far away...

1960s Information Management System (IMS) by IBM
• Built for the Saturn V moon rocket
• Hierarchical, tree structure

1970 Relational Model Paper
• Base for IBM DB1 and DB2

1980s-90s Development of RDBMS
• Widely adopted
• Models easy to define
• ACID transactions
• Clients for all stacks
• ORMs

2000s Web 2.0
• Large volume (petabytes)
• Faster networks and devices
• Systems must scale

Problems scaling Relational DBs
• Sharding is hard
• Keeping transactions ACID is hard
• Two-phase commit is hard
• Parallelizing is hard
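
To make the two-phase commit bullet concrete, here is a minimal in-memory sketch in Python. It is not taken from the talk: the Participant class, its can_commit flag and the two_phase_commit function are hypothetical names used only for illustration, and there is no networking, durable logging or failure recovery.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical shard taking part in a distributed transaction."""
    name: str
    can_commit: bool = True  # simulates whether the local prepare step succeeds
    state: str = "idle"
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local work could be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        self.log.append(self.state)
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"
        self.log.append(self.state)

    def rollback(self) -> None:
        self.state = "aborted"
        self.log.append(self.state)


def two_phase_commit(participants) -> bool:
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Even in this toy version every transaction needs two rounds of messages to all shards. In a real system a coordinator crash between the two phases can leave prepared participants blocked, which is part of why the protocol is hard to run at scale.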

Distributed Processing Not Only SQL

Relational databases The CAP theorem

Key-value stores
Good fit:
◦ User session data
◦ Component configuration
◦ Cached data, fast access
Poor fit:
◦ Complex queries
◦ Interconnected data
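
The contrast between fast key access and complex queries can be sketched with a plain Python dict standing in for a key-value store. The session keys, fields and function names below are invented for illustration and do not come from the talk.

```python
# Hypothetical in-memory key-value store holding user session data.
sessions = {
    "sess:1": {"user": "ana", "country": "ES", "cart_items": 3},
    "sess:2": {"user": "luis", "country": "PT", "cart_items": 0},
    "sess:3": {"user": "marta", "country": "ES", "cart_items": 1},
}


def get_session(key):
    # The access pattern a key-value store is built for: one key, one value.
    return sessions.get(key)


def sessions_in_country(country):
    # A query on the value's contents: with no secondary index,
    # every stored value has to be scanned.
    return sorted(k for k, v in sessions.items() if v["country"] == country)
```

Lookup by key stays cheap as the store grows, while the second function touches every entry, which is the kind of query the slide lists as a weakness.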

Column-oriented DB
Good fit:
◦ Real time analytics
◦ Facebook Messenger
Poor fit:
◦ Queries against few rows
◦ Flexible data schemas
◦ Incremental data loads/deletes

Document-oriented DB
Good fit:
◦ Records with different fields
◦ Models with many layers
Poor fit:
◦ Joins
◦ Flexible queries

Graph DB
Good fit:
◦ Routing
◦ Social networks
◦ Disease spreading
Poor fit:
◦ Hard to do aggregates
◦ Analytics

No one magic database to rule them all
• Each of them fits a small number of use cases
• Often hard, complex and expensive to maintain
• Specific query languages (e.g. CQL)

MEANWHILE IN THE RELATIONAL BATCAVE

2010s Relational strikes back
• Less structured data formats
• Partitioning
• Parallel execution
• Sharding
• C, A and P?

• ACID for queries going to a single shard
• Open Source, DAAS
• PostgreSQL extension
• Interactive analytics
• Multi-tenant

• Fully ACID
• Open Source
• PostgreSQL fork
• Scaling intensive
• Multi-tenant
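
The single-shard condition in the first list can be illustrated with a small hash-routing sketch in Python. It assumes a multi-tenant schema sharded by a tenant identifier; the shard count, the tenant names and all function names are hypothetical and do not describe how any specific product routes queries.

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of shards


def shard_for(tenant_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a tenant to a shard with a stable hash (Python's hash() is salted per process)."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shards_touched(tenant_ids) -> set:
    # The set of shards a query filtered on these tenants would have to visit.
    return {shard_for(t) for t in tenant_ids}


def is_single_shard(tenant_ids) -> bool:
    # True when all referenced tenants live on one shard, so a local
    # transaction on that shard is enough and no cross-shard commit is needed.
    return len(shards_touched(tenant_ids)) == 1
```

A query that filters on one tenant always resolves to one shard, which is why multi-tenant workloads fit this model well. Queries spanning several tenants may fan out to multiple shards and lose the single-shard guarantee.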

WHEN YOU HAVE A HAMMER...

Questions? tech.geoblink.com
