Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Working with Huge Databases and Tables

Working with Huge Databases and Tables

Alternate Title: Zero-downtime Migration on Big Database

Presented at [TOKYO Rails #39](https://www.meetup.com/tokyo-rails/events/259500572/) on April 16, 2019.

F1c4a3bb1606cc4a61711e61e2fe6146?s=128

Prem Sichanugrist

April 16, 2019
Tweet

More Decks by Prem Sichanugrist

Other Decks in Programming

Transcript

  1. Working with Huge Databases and Tables Prem Sichanugrist (@sikachu) Tokyo

    Rails #39 2019-04-16
  2. Prem Sichanugrist @sikachu /sikachu

  3. Working with Huge Databases and Tables

  4. Migrations

  5. Zero-Downtime Migrations

  6. Why Zero-Downtime Migrations?

  7. Why Zero-Downtime Migrations? • Global service with users around the

    world ◦ Users using the service 27/4 from different time zones ◦ We want our service to always be available
  8. Why Zero-Downtime Migrations? • Global service with users around the

    world ◦ Users using the service 27/4 from different time zones ◦ We want our service to always be available • Normal migration process locks table ◦ We have a lot of user data (~300 GB on disk, biggest table has 500M rows) ◦ Normal migration locks the whole table and could take service down
  9. Why Zero-Downtime Migrations? • Global service with users around the

    world ◦ Users using the service 27/4 from different time zones ◦ We want our service to always be available • Normal migration process locks table ◦ We have a lot of user data (~300 GB on disk, biggest table has 500M rows) ◦ Normal migration locks the whole table and could take service down • We want to be able to do continuous deployment ◦ No “code freeze”, every merges to master goes to production right away ◦ Normal migration will cause deployment disruption throughout the day
  10. Zero-Downtime Migrations

  11. Percona Online Schema Change

  12. Percona Online Schema Change • Also known as: pt-osc

  13. Percona Online Schema Change • Also known as: pt-osc •

    Part of Percona Toolkit (http://www.percona.com/software/) ◦ Collection of advanced command-line tools used by Percona support staff — README.md
  14. Percona Online Schema Change • Also known as: pt-osc •

    Part of Percona Toolkit (http://www.percona.com/software/) ◦ Collection of advanced command-line tools used by Percona support staff — README.md • Provide command to perform online schema change ◦ Create new temporary table with updated schema ◦ Copy data over to the new table ◦ Drop old table, and rename temporary table to old table’s name (atomic)
  15. Percona Online Schema Change pt-online-schema-change \ --alter "ADD COLUMN column_name

    VARCHAR(255)" \ D=database_name,t=table_name https://www.percona.com/doc/percona-toolkit/LATEST/pt-online-schema-change.html
  16. Percona Online Schema Change

  17. Percona Online Schema Change + Rails

  18. Departure https://github.com/departurerb/departure

  19. Departure • Hooks into Active Record's connection adapter syntax ◦

    No need to write custom migration!
  20. Departure • Hooks into Active Record's connection adapter syntax ◦

    No need to write custom migration! • Construct arguments for pt-online-schema-change ◦ Then runs pt-online-schema-change to perform the migration
  21. None
  22. None
  23. None
  24. One Morning in Slack • A deploy to staging was

    going on for 3 hours.
  25. One Morning in Slack • A deploy to staging was

    going on for 3 hours • Deployment was stuck at migration step ◦ SRE team started investigate
  26. One Morning in Slack • A deploy to staging was

    going on for 3 hours • Deployment was stuck at migration step ◦ We tried to add a new column to a table that has about 100 million rows
  27. Percona Online Schema Change • Also known as: pt-osc •

    Part of Percona Toolkit (http://www.percona.com/software/) ◦ Collection of advanced command-line tools used by Percona support staff — README.md • Provide command to perform online schema change ◦ Create new temporary table with updated schema ◦ Copy data over to the new table ◦ Drop old table, and rename temporary table to old table’s name (atomic)
  28. One Morning in Slack • A deploy to staging was

    going on for 3 hours • Deployment was stuck at migration step ◦ We tried to add a new column to a table that has about 100M rows ◦ It takes a long time to copy over data of 100M rows to the new table
  29. One Morning in Slack • A deploy to staging was

    going on for 3 hours • Deployment was stuck at migration step ◦ We tried to add a new column to a table that has about 100M rows ◦ It takes a long time to copy over data of 100M rows to the new table • Deployment canceled, SRE team took over the database migration ◦ SRE team runs pt-online-schema-change directly on the server
  30. None
  31. InnoDB Online DDL Operations https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl-operations.html

  32. InnoDB Online DDL Operations • Available from MySQL 5.6+ ◦

    We are using Amazon RDS for MySQL 5.6+
  33. InnoDB Online DDL Operations • Available from MySQL 5.6+ ◦

    We are using Amazon RDS for MySQL 5.6+ • Various operations can be performed online ◦ CREATE INDEX, DROP INDEX ◦ ADD COLUMN, DROP COLUMN* ◦ Rename column, reorder column* ◦ SET DEFAULT, DROP DEFAULT, change column nullability ◦ Set AUTO_INCREMENT* ◦ Add/remove foreign key ◦ RENAME TABLE*
  34. InnoDB Online DDL Operations • Available from MySQL 5.6+ ◦

    We are using Amazon RDS for MySQL 5.6+ • Various operations can be performed online ◦ CREATE INDEX, DROP INDEX, ADD COLUMN, DROP COLUMN, … • Some DDL queries need to be modified to run online
  35. InnoDB Online DDL Operations ALTER TABLE tbl_name ADD COLUMN column_name

    column_definition; ALTER TABLE tbl_name ADD COLUMN column_name column_definition, ALGORITHM=INPLACE, LOCK=NONE;
  36. InnoDB Online DDL Operations ALTER TABLE tbl_name ALTER COLUMN col

    SET DEFAULT literal; ALTER TABLE tbl_name ALTER COLUMN col SET DEFAULT literal, ALGORITHM=INPLACE, LOCK=NONE;
  37. InnoDB Online DDL Operations CREATE INDEX name ON table (col_list);

    DROP INDEX name ON table; -- Already online operations
  38. InnoDB Online DDL Operations + Rails

  39. Index Operations • CREATE INDEX, DROP INDEX

  40. Index Operations • CREATE INDEX, DROP INDEX • Disable Departure,

    migration can be run during deploy if not too long
  41. Column Operations • ADD COLUMN, DROP COLUMN, rename column, change

    default value
  42. Column Operations • ADD COLUMN, DROP COLUMN, rename column, change

    default value • Can’t run directly as Rails’ generated SQL locks table
  43. Column Operations • ADD COLUMN, DROP COLUMN, rename column, change

    default value • Can’t run directly as Rails’ generated SQL locks table • Open new issue with SRE team, asking to run migration DDL with ALGORITHM=INPLACE, LOCK=NONE; ◦ Then, perform INSERT into schema_migrations with migration version directly
  44. None
  45. None
  46. Online DDL Operations are faster, but ... • We now

    require SRE’s help to perform migration tasks
  47. Online DDL Operations are faster, but ... • We now

    require SRE’s help to perform migration tasks • Too many steps to perform by human ◦ Open PR, get it approved ◦ Open ticket with SRE with online SQL command to run ◦ SRE has to run command in MySQL console, has to manually insert migration version ◦ Merges PR
  48. Online DDL Operations are faster, but ... • We now

    require SRE’s help to perform migration tasks • Too many steps required to perform by human ◦ Open PR, get it approved ◦ Open ticket with SRE with online SQL command to run ◦ SRE has to run command in MySQL console, has to manually insert migration version ◦ Merges PR ◦ Mistake can happen if not done properly
  49. None
  50. None
  51. Improvements • Create Departure-like gem to hook into Active Record’s

    Migration ◦ Generate SQL that tells MySQL to perform non-locking DDL operations ◦ Migration runs automatically during deployment just like normal migrations ◦ No need to ask for SRE help
  52. Improvements • Create Departure-like gem to hook into Active Record’s

    Migration ◦ Generate SQL that tells MySQL to perform non-locking DDL operations ◦ Migration runs automatically during deployment just like normal migrations ◦ No need to ask for SRE help • Separate migration deploy from code deployment ◦ Prevent blocking code deployment to production ◦ Deploy database migration first on a separate deploy process ▪ Mark that migration as finished, so migration in code PR will be no-op ◦ Needs to write code such that it still works before and after database migration ▪ Multiple steps process for renaming column/table.
  53. Alternatives • Use external tool to manage database schema instead

    ◦ Ridgepole (https://github.com/winebarrel/ridgepole) ◦ Store database schema in a separate database ◦ Database migrations can be apply independently from code changes
  54. Prem Sichanugrist @sikachu /sikachu Thank you! All Illustrations are from

    いらすとや (https://www.irasutoya.com)