Zero-downtime
schema evolution
in SQL databases
Without disruption, or performance degradation!
Slide 6
Slide 6 text
Relevance
• Web applications
• Healthcare systems
• Banking / financial systems
• Air travel / booking systems
• Anything which is too expensive to go offline
• Continuous Deployment
• OpenArk kit
• Percona toolkit
• Large Hadron Migrator
• Table Migrator
• Online Schema Change
Solutions
Tools from industry
All work more or less
the same way
Slide 10
Slide 10 text
1. Current situation
Slide 11
Slide 11 text
1. Current situation
Slide 12
Slide 12 text
2. Create clone of table
Slide 13
Slide 13 text
3. Copy data to new table
Slide 14
Slide 14 text
3. Copy data to new table
Slide 15
Slide 15 text
4. Atomically rename table
Slide 16
Slide 16 text
• PRISM++
• IMAGO
• Column-oriented databases (noSQL)
• Google’s Spanner & F1 database (newSQL)
Solutions
Academic research
Slide 17
Slide 17 text
Challenges
• Referential integrity support.
• Versioning schema changes.
• Rolling back failures.
• Complex and mostly manual task.
• Mostly tools only support MySQL.
• Switching to alternative database can be costly.
What could go wrong?
Slide 18
Slide 18 text
What do we do now?
Slide 19
Slide 19 text
Approach
1. Create tool to test databases for blocking
behaviour.
2. Profile commonly used SQL databases.
3. Prototype solution to deal with schema changes.
4. Verify prototype shows no blocking behaviour
with said profiling tool.
5. Test in production environments / case studies.
Run away if that fails
Slide 20
Slide 20 text
Nemesis
1. Create a table and insert 100m records.
2. “Simulate load” by barraging the database with
SELECT, UPDATE, INSERT, and DELETE queries.
3. Execute a DDL query to modify the table’s
structure.
4. Record all start and end times for every query.
5. Plot the results.
Slide 21
Slide 21 text
Time (100ms periods)
Duration of queries
(1px = 1ms)
Period where DDL
statement is executing
Colors represent different
types of DML queries
Slide 22
Slide 22 text
The results
I have no idea what I’m doing…
Slide 23
Slide 23 text
Add nullable column
Add non-nullable column
Rename nullable column
Rename non-nullable column
Drop nullable column
Drop non-nullable column
Create index
Rename index
Drop index
!
Make column nullable
Make column non-nullable
Set default on nullable column
Set default on non-nullable column
Modify type on nullable column
Modify type on non-nullable column
Modify type from int to text
Add non-nullable foreign key
Add nullable foreign key
!
Non-blocking
Semi-blocking
Blocking
Too fast to profile
Slide 24
Slide 24 text
Add nullable column
Add non-nullable column
Rename nullable column
Rename non-nullable column
Drop nullable column
Drop non-nullable column
Create index
Rename index
Drop index
!
Make column nullable
Make column non-nullable
Set default on nullable column
Set default on non-nullable column
Modify type on nullable column
Modify type on non-nullable column
Modify type from int to text
Add non-nullable foreign key
Add nullable foreign key
!
Non-blocking
Semi-blocking
Blocking
Too fast to profile
Slide 25
Slide 25 text
Take away
• Don’t ever trust DDL statements
that operate on a “live” table.
• Unless you have a very intimate
understanding of your own SQL database.
• And then still don’t trust DDL statements.
Slide 26
Slide 26 text
What’s next?
Been there, done that…
Slide 27
Slide 27 text
Next up
• Discuss results on mailing lists
of profiled SQL databases.
• I really should start blogging about this…
• Lock myself up in a room,
and start working on a prototype.