Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[ICSE '17] Zero-Downtime SQL Database Schema Ev...
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Michael de Jong
May 25, 2017
Research
1
840
[ICSE '17] Zero-Downtime SQL Database Schema Evolution for Continuous Deployment
Presentation given at ICSE 2017 in Buenos Aires in the "Software Engineering In Practice" track.
Michael de Jong
May 25, 2017
Tweet
Share
More Decks by Michael de Jong
See All by Michael de Jong
[SIC '16] Zero-Downtime Database Schema Evolution
michaeldejong
0
160
Zero-Downtime SQL Database Schema Evolution for Continuous Deployment
michaeldejong
0
380
[RELENG '15] Continuous Deployment & Schema Evolution in SQL Databases
michaeldejong
0
1.4k
[CHISEL] Introduction to zero-downtime schema evolution in SQL databases.
michaeldejong
2
250
[CHISEL] Software Engineering at Magnet.me
michaeldejong
1
340
Contributing to RxJava
michaeldejong
2
640
Other Decks in Research
See All in Research
ForestCast: Forecasting Deforestation Risk at Scale with Deep Learning
satai
3
390
それ、チームの改善になってますか?ー「チームとは?」から始めた組織の実験ー
hirakawa51
0
660
Remote sensing × Multi-modal meta survey
satai
4
710
Sat2City:3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion
satai
4
660
製造業主導型経済からサービス経済化における中間層形成メカニズムのパラダイムシフト
yamotty
0
480
第二言語習得研究における 明示的・暗示的知識の再検討:この分類は何に役に立つか,何に役に立たないか
tam07pb915
0
1.1k
Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification
satai
3
590
HoliTracer:Holistic Vectorization of Geographic Objects from Large-Size Remote Sensing Imagery
satai
3
620
視覚から身体性を持つAIへ: 巧緻な動作の3次元理解
tkhkaeio
0
190
ドメイン知識がない領域での自然言語処理の始め方
hargon24
1
240
20年前に50代だった人たちの今
hysmrk
0
140
Proposal of an Information Delivery Method for Electronic Paper Signage Using Human Mobility as the Communication Medium / ICCE-Asia 2025
yumulab
0
170
Featured
See All Featured
Balancing Empowerment & Direction
lara
5
890
Agile that works and the tools we love
rasmusluckow
331
21k
Primal Persuasion: How to Engage the Brain for Learning That Lasts
tmiket
0
250
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3.3k
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
4
2.3k
Ecommerce SEO: The Keys for Success Now & Beyond - #SERPConf2024
aleyda
1
1.8k
Writing Fast Ruby
sferik
630
62k
RailsConf 2023
tenderlove
30
1.3k
A Tale of Four Properties
chriscoyier
162
24k
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
The agentic SEO stack - context over prompts
schlessera
0
640
How to Think Like a Performance Engineer
csswizardry
28
2.4k
Transcript
Zero-Downtime SQL Database Schema Evolution for Continuous Deployment
Michael de Jong Magnet.me Arie van Deursen Delft University of
Technology Anthony Cleve University of Namur
None
30+ 1.200+ 90.000+ 2 Employees Companies Students Countries
Full-time developers (Micro)services Master is green? deploy! No manual testing!
6 20+
We deploy frequently and we’re not the only ones [3][13]
version deploys/month schema changes/month ~1.400 ~16
version deploys/month schema changes/month ~1.400 ~16 Roughly doubled since writing
the paper
Blocking schema operations Why are my schema changes crashing my
service?
Mixed-State How to deal with combinations of different versions of
the database schema and your application
Result: manual deploys Can’t deploy new schema during the day
because it’s not safe, so you have to deploy it manually at night
Result: manual deploys Can’t deploy new schema during the day
because it’s not safe, so you have to deploy it manually at night Not Continuous Deployment
Understanding behaviour blocking
Experiments Created test database Simulated application running on top Tested
19 different schema operations Using different version of MySQL / PG Plotted when queries were executed
Activity plots Adding a non-nullable column (PG 9.4) Making a
column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Before applying schema operation Adding a non-nullable column
(PG 9.4) Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots While applying schema operation Adding a non-nullable column
(PG 9.4) Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Adding a non-nullable column (PG 9.4) Making a
column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Blocking Adding a non-nullable column (PG 9.4) Making
a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Read-only Blocking Adding a non-nullable column (PG 9.4)
Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Blocking Read-only Non-blocking Adding a non-nullable column (PG
9.4) Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
MySQL 5.5 MySQL 5.6 PostgreSQL 9.3 PostgreSQL 9.4 S1 Read-only
Non-Blocking Blocking Blocking S2 Read-only Non-Blocking Non-Blocking Non-Blocking S3 Read-only Non-Blocking Non-Blocking Non-Blocking S4 Read-only Non-Blocking Non-Blocking Non-Blocking S5 Read-only Non-Blocking Non-Blocking Non-Blocking S6 Read-only Non-Blocking Non-Blocking Non-Blocking S7 N/A N/A Non-Blocking Non-Blocking S8 N/A N/A Non-Blocking Non-Blocking S9 N/A N/A Blocking Blocking S10 Read-only Read-only Non-Blocking Non-Blocking S11 Read-only Read-only Blocking Blocking S12 Non-Blocking Non-Blocking Non-Blocking Non-Blocking S13 Non-Blocking Non-Blocking Non-Blocking Non-Blocking S14 Read-only Read-only Blocking Blocking S15 Read-only Read-only Blocking Blocking S16 Read-only Non-Blocking Non-Blocking Non-Blocking S17 N/A N/A Non-Blocking Blocking S18 Blocking Non-Blocking Non-Blocking Non-Blocking S19 Non-Blocking Non-Blocking Non-Blocking Non-Blocking
Observations Schema changes are high-risk CD leads to frequent schema
changes We need to automate this process Are there any tools already out there?
OpenArk Kit TableMigrator pt-online-schema-change (Percona) Large Hadron Migrator (SoundCloud) Online
Schema Change (Facebook) Gh-ost (GitHub) Existing tools
Limitations Only support MySQL None support defining changesets All use
an atomic switchover None support Referential Integrity
R1: Non-Blocking Schema Changes R2: Schema Changesets R3: Concurrently Active
Schemas R4: Referential Integrity R5: Schema Isolation R6: Non-Invasive R7: Resilience
QuantumDB
movies rentals customers Netflix 2.0
changelog.addChangeSet(“Add referral column to customers table”, addColumn(“customers”, “referred_by”, integer()), addForeignKey(“customers”,
“referred_by”) .named(“customer_referred_by_fk”) .onDelete(NO_ACTION) .referencing(“customers”, “id”)); Defining changes
movies rentals customers Evolving the schema
movies rentals customers customers* Evolving the schema Forking the table
under change
Ghost table: Structural copy of an existing table with schema
changes already applied to it movies rentals customers customers* Evolving the schema Forking the table under change
movies rentals customers customers* Evolving the schema Forking the table
under change
movies rentals rentals* customers customers* Evolving the schema Forking dependent
tables
movies rentals* customers* rentals customers Evolving the schema A structurally
forked database schema
movies rentals* customers* rentals customers Evolving the schema Creating forward
database triggers (1-way sync)
movies rentals* customers* rentals customers Evolving the schema Copy data
from original tables to their ghost tables
movies customers* rentals customers rentals* Evolving the schema Creating backward
database triggers (2-way sync)
movies rentals* customers* rentals customers Accessing the database Original Schema
New Schema Original Schema movies rentals* customers* rentals customers Accessing
the database Two co-existing database schemas
Application Driver Accessing the database
Application Wrapper Driver Accessing the database Intercepting database interactions
Connection connection = DriverManager.getConnection( “jdbc:postgresql://localhost:5432/database”, “username”, “password”); Connection connection =
DriverManager.getConnection( “jdbc:quantumdb:postgresql://localhost:5432/database? version=80bfa11”,“username”, “password”); Accessing the database Intercepting database interactions
SELECT * FROM rentals WHERE customer_id = 2372 AND return_date
< NOW() AND returned = false; SELECT * FROM rentals* WHERE customer_id = 2372 AND return_date < NOW() AND returned = false; Accessing the database Rewriting intercepted queries
movies rentals customers Dropping a version When it’s no longer
needed rentals* customers*
movies rentals customers Dropping a version Drop the database triggers
rentals* customers*
movies rentals* customers* Dropping a version Drop tables only used
in the obsolete version
movies rentals* customers* Dropping a version Leaving you with the
resulting schema New Schema
Evaluation
Experiments
MySQL 5.5 MySQL 5.6 PostgreSQL 9.3 PostgreSQL 9.4 PG 9.4
+ QuantumDB S1 Read-only Non-Blocking Blocking Blocking Non-Blocking S2 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S3 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S4 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S5 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S6 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S7 N/A N/A Non-Blocking Non-Blocking Non-Blocking S8 N/A N/A Non-Blocking Non-Blocking Non-Blocking S9 N/A N/A Blocking Blocking Non-Blocking S10 Read-only Read-only Non-Blocking Non-Blocking Non-Blocking S11 Read-only Read-only Blocking Blocking Non-Blocking S12 Non-Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking S13 Non-Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking S14 Read-only Read-only Blocking Blocking Non-Blocking S15 Read-only Read-only Blocking Blocking Non-Blocking S16 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S17 N/A N/A Non-Blocking Blocking Non-Blocking S18 Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking S19 Non-Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking
Industry case
Changesets (11 month period ’15/‘16) of these contain blocking operations
95 37%
supported by QuantumDB partially supported not yet supported 64% 25%
11%
Due to missing support for Executing DML queries during migration
User-defined functions User-defined database triggers User-defined views
Benchmarking “Normal” method Done using Liquibase Requires downtime in production
(ie. serving error pages or limiting functionality) ~8 minutes in total for all 95 changesets (on test server without any load on the database)
Benchmarking QuantumDB method Done using QuantumDB No downtime required at
all (ie. service can operate normally) ~2h 25m in total for all 95 changesets (on test server without any load on the database)
Where to next?
Add more support for views, functions, and triggers Start testing
in a live production environment Which allows us to get real-world measurements, and developer input
Future research direction? Using QuantumDB to perform A/B testing of
database schemas
Summary CD leads to frequent code/schema deploys Not all schema
ops are safe in production We need a tool to do evolution safely QuantumDB is that tool
QuantumDB Zero-Downtime SQL Database Schema Evolution for Continuous Deployment Open-Source
github.com/quantumdb quantumdb.io We need your help! Contribute code, insights, or more research