Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[ICSE '17] Zero-Downtime SQL Database Schema Ev...
Search
Michael de Jong
May 25, 2017
Research
1
830
[ICSE '17] Zero-Downtime SQL Database Schema Evolution for Continuous Deployment
Presentation given at ICSE 2017 in Buenos Aires in the "Software Engineering In Practice" track.
Michael de Jong
May 25, 2017
Tweet
Share
More Decks by Michael de Jong
See All by Michael de Jong
[SIC '16] Zero-Downtime Database Schema Evolution
michaeldejong
0
160
Zero-Downtime SQL Database Schema Evolution for Continuous Deployment
michaeldejong
0
380
[RELENG '15] Continuous Deployment & Schema Evolution in SQL Databases
michaeldejong
0
1.3k
[CHISEL] Introduction to zero-downtime schema evolution in SQL databases.
michaeldejong
2
250
[CHISEL] Software Engineering at Magnet.me
michaeldejong
1
340
Contributing to RxJava
michaeldejong
2
630
Other Decks in Research
See All in Research
When Learned Data Structures Meet Computer Vision
matsui_528
1
230
スキマバイトサービスにおける現場起点でのデザインアプローチ
yoshioshingyouji
0
250
能動適応的実験計画
masakat0
2
950
一人称視点映像解析の最先端(MIRU2025 チュートリアル)
takumayagi
6
4.1k
若手研究者が国際会議(例えばIROS)でワークショップを企画するメリットと成功法!
tanichu
0
110
Combinatorial Search with Generators
kei18
0
1.1k
SNLP2025:Can Language Models Reason about Individualistic Human Values and Preferences?
yukizenimoto
0
190
Pythonでジオを使い倒そう! 〜それとFOSS4G Hiroshima 2026のご紹介を少し〜
wata909
0
1.1k
Combining Deep Learning and Street View Imagery to Map Smallholder Crop Types
satai
3
160
20250605_新交通システム推進議連_熊本都市圏「車1割削減、渋滞半減、公共交通2倍」から考える地方都市交通政策
trafficbrain
0
970
PhD Defense 2025: Visual Understanding of Human Hands in Interactions
tkhkaeio
1
280
AI in Enterprises - Java and Open Source to the Rescue
ivargrimstad
0
870
Featured
See All Featured
Music & Morning Musume
bryan
46
6.9k
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
31
2.7k
The Power of CSS Pseudo Elements
geoffreycrofte
80
6k
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.3k
Rails Girls Zürich Keynote
gr2m
95
14k
Raft: Consensus for Rubyists
vanstee
140
7.2k
Done Done
chrislema
186
16k
GraphQLの誤解/rethinking-graphql
sonatard
73
11k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
23
1.5k
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.5k
Scaling GitHub
holman
463
140k
Transcript
Zero-Downtime SQL Database Schema Evolution for Continuous Deployment
Michael de Jong Magnet.me Arie van Deursen Delft University of
Technology Anthony Cleve University of Namur
None
30+ 1.200+ 90.000+ 2 Employees Companies Students Countries
Full-time developers (Micro)services Master is green? deploy! No manual testing!
6 20+
We deploy frequently and we’re not the only ones [3][13]
version deploys/month schema changes/month ~1.400 ~16
version deploys/month schema changes/month ~1.400 ~16 Roughly doubled since writing
the paper
Blocking schema operations Why are my schema changes crashing my
service?
Mixed-State How to deal with combinations of different versions of
the database schema and your application
Result: manual deploys Can’t deploy new schema during the day
because it’s not safe, so you have to deploy it manually at night
Result: manual deploys Can’t deploy new schema during the day
because it’s not safe, so you have to deploy it manually at night Not Continuous Deployment
Understanding behaviour blocking
Experiments Created test database Simulated application running on top Tested
19 different schema operations Using different version of MySQL / PG Plotted when queries were executed
Activity plots Adding a non-nullable column (PG 9.4) Making a
column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Before applying schema operation Adding a non-nullable column
(PG 9.4) Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots While applying schema operation Adding a non-nullable column
(PG 9.4) Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Adding a non-nullable column (PG 9.4) Making a
column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Blocking Adding a non-nullable column (PG 9.4) Making
a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Read-only Blocking Adding a non-nullable column (PG 9.4)
Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
Activity plots Blocking Read-only Non-blocking Adding a non-nullable column (PG
9.4) Making a column non-nullable (MySQL 5.6) Adding a non-nullable column (MySQL 5.6)
MySQL 5.5 MySQL 5.6 PostgreSQL 9.3 PostgreSQL 9.4 S1 Read-only
Non-Blocking Blocking Blocking S2 Read-only Non-Blocking Non-Blocking Non-Blocking S3 Read-only Non-Blocking Non-Blocking Non-Blocking S4 Read-only Non-Blocking Non-Blocking Non-Blocking S5 Read-only Non-Blocking Non-Blocking Non-Blocking S6 Read-only Non-Blocking Non-Blocking Non-Blocking S7 N/A N/A Non-Blocking Non-Blocking S8 N/A N/A Non-Blocking Non-Blocking S9 N/A N/A Blocking Blocking S10 Read-only Read-only Non-Blocking Non-Blocking S11 Read-only Read-only Blocking Blocking S12 Non-Blocking Non-Blocking Non-Blocking Non-Blocking S13 Non-Blocking Non-Blocking Non-Blocking Non-Blocking S14 Read-only Read-only Blocking Blocking S15 Read-only Read-only Blocking Blocking S16 Read-only Non-Blocking Non-Blocking Non-Blocking S17 N/A N/A Non-Blocking Blocking S18 Blocking Non-Blocking Non-Blocking Non-Blocking S19 Non-Blocking Non-Blocking Non-Blocking Non-Blocking
Observations Schema changes are high-risk CD leads to frequent schema
changes We need to automate this process Are there any tools already out there?
OpenArk Kit TableMigrator pt-online-schema-change (Percona) Large Hadron Migrator (SoundCloud) Online
Schema Change (Facebook) Gh-ost (GitHub) Existing tools
Limitations Only support MySQL None support defining changesets All use
an atomic switchover None support Referential Integrity
R1: Non-Blocking Schema Changes R2: Schema Changesets R3: Concurrently Active
Schemas R4: Referential Integrity R5: Schema Isolation R6: Non-Invasive R7: Resilience
QuantumDB
movies rentals customers Netflix 2.0
changelog.addChangeSet(“Add referral column to customers table”, addColumn(“customers”, “referred_by”, integer()), addForeignKey(“customers”,
“referred_by”) .named(“customer_referred_by_fk”) .onDelete(NO_ACTION) .referencing(“customers”, “id”)); Defining changes
movies rentals customers Evolving the schema
movies rentals customers customers* Evolving the schema Forking the table
under change
Ghost table: Structural copy of an existing table with schema
changes already applied to it movies rentals customers customers* Evolving the schema Forking the table under change
movies rentals customers customers* Evolving the schema Forking the table
under change
movies rentals rentals* customers customers* Evolving the schema Forking dependent
tables
movies rentals* customers* rentals customers Evolving the schema A structurally
forked database schema
movies rentals* customers* rentals customers Evolving the schema Creating forward
database triggers (1-way sync)
movies rentals* customers* rentals customers Evolving the schema Copy data
from original tables to their ghost tables
movies customers* rentals customers rentals* Evolving the schema Creating backward
database triggers (2-way sync)
movies rentals* customers* rentals customers Accessing the database Original Schema
New Schema Original Schema movies rentals* customers* rentals customers Accessing
the database Two co-existing database schemas
Application Driver Accessing the database
Application Wrapper Driver Accessing the database Intercepting database interactions
Connection connection = DriverManager.getConnection( “jdbc:postgresql://localhost:5432/database”, “username”, “password”); Connection connection =
DriverManager.getConnection( “jdbc:quantumdb:postgresql://localhost:5432/database? version=80bfa11”,“username”, “password”); Accessing the database Intercepting database interactions
SELECT * FROM rentals WHERE customer_id = 2372 AND return_date
< NOW() AND returned = false; SELECT * FROM rentals* WHERE customer_id = 2372 AND return_date < NOW() AND returned = false; Accessing the database Rewriting intercepted queries
movies rentals customers Dropping a version When it’s no longer
needed rentals* customers*
movies rentals customers Dropping a version Drop the database triggers
rentals* customers*
movies rentals* customers* Dropping a version Drop tables only used
in the obsolete version
movies rentals* customers* Dropping a version Leaving you with the
resulting schema New Schema
Evaluation
Experiments
MySQL 5.5 MySQL 5.6 PostgreSQL 9.3 PostgreSQL 9.4 PG 9.4
+ QuantumDB S1 Read-only Non-Blocking Blocking Blocking Non-Blocking S2 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S3 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S4 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S5 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S6 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S7 N/A N/A Non-Blocking Non-Blocking Non-Blocking S8 N/A N/A Non-Blocking Non-Blocking Non-Blocking S9 N/A N/A Blocking Blocking Non-Blocking S10 Read-only Read-only Non-Blocking Non-Blocking Non-Blocking S11 Read-only Read-only Blocking Blocking Non-Blocking S12 Non-Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking S13 Non-Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking S14 Read-only Read-only Blocking Blocking Non-Blocking S15 Read-only Read-only Blocking Blocking Non-Blocking S16 Read-only Non-Blocking Non-Blocking Non-Blocking Non-Blocking S17 N/A N/A Non-Blocking Blocking Non-Blocking S18 Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking S19 Non-Blocking Non-Blocking Non-Blocking Non-Blocking Non-Blocking
Industry case
Changesets (11 month period ’15/‘16) of these contain blocking operations
95 37%
supported by QuantumDB partially supported not yet supported 64% 25%
11%
Due to missing support for Executing DML queries during migration
User-defined functions User-defined database triggers User-defined views
Benchmarking “Normal” method Done using Liquibase Requires downtime in production
(ie. serving error pages or limiting functionality) ~8 minutes in total for all 95 changesets (on test server without any load on the database)
Benchmarking QuantumDB method Done using QuantumDB No downtime required at
all (ie. service can operate normally) ~2h 25m in total for all 95 changesets (on test server without any load on the database)
Where to next?
Add more support for views, functions, and triggers Start testing
in a live production environment Which allows us to get real-world measurements, and developer input
Future research direction? Using QuantumDB to perform A/B testing of
database schemas
Summary CD leads to frequent code/schema deploys Not all schema
ops are safe in production We need a tool to do evolution safely QuantumDB is that tool
QuantumDB Zero-Downtime SQL Database Schema Evolution for Continuous Deployment Open-Source
github.com/quantumdb quantumdb.io We need your help! Contribute code, insights, or more research