Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Denver.rb - Database Migrations at Scale

Lucas Moore
November 14, 2017

Denver.rb - Database Migrations at Scale

This talk will outline the tools and processes we use to update our databases at Ibotta. This includes a deep dive into rails migrations, keeping our schema files in sync across teams, deploying code and, most excitingly, using a tool called gh-ost to migrate otherwise untenable tables without taking them off line.

Lucas Moore

November 14, 2017
Tweet

Transcript

  1. 2 2 - Who Am I? - Rails Generators Deep

    Dive - Tips for keeping things *nsync - Using gh-ost!
  2. Lucas Moore @thelucasmoore 3 Flatiron School Graduate Engineer #22 at

    Ibotta “Recovering Educator” Ethereum Enthusiast
  3. Rails Generators ActiveRecord ftw! 4 - $ rails generator [args]

    [options] - Follow with a rake task to migrate the database - Example [args] - $ rails g model - $ rails g migration - $ rails g scaffold
  4. $ rails g model 7 - Use case: a brand

    new model - Creates the model, migration & tests if done right - $ rails g model Pet name:string breed:string
  5. $ rails g migration 8 - Live in /db/migrate folder

    - ActiveRecord Migrations Lesson - Ruby on Rails Guides: Migrations - Use case: specific database changes - Creating and dropping tables - Adding, removing, renaming columns
  6. - $ rails g migration AddColumn - $ rails g

    migration AddNameToCustomers name:string - Just a nicely packaged DSL for executing SQL btw - These give us a #change method in our migrations Migrations 11
  7. Schema Files 24 - From the docs: “Schema dumps are

    the authoritative source for your database schema.” - At Ibotta, we have a local, staging and production environments - In production, we have master and read-only replicas too - Ugh, how to keep it all in sync - Source control and db/schema.rb with spacing, version, etc
  8. Git It Together 25 - $ git checkout develop db/schema.rb

    - $ rake db:migrate db:test:prepare - $ git add db/schema.rb --patch - $ git commit -m “yaknow” - $ git push - $ git stash
  9. $ git add --patch 27 - Blog Post - Each

    change is a hunk - y - Yes, add this hunk - n - No, don’t add this hunk - d - No, don’t add this hunk and all other remaining hunks. - s - Split the hunk into smaller hunks. - e - Manually edit the hunk. - Shoutout to Ben Limmer!
  10. Git It Together 28 - $ git checkout develop db/schema.rb

    - $ rake db:migrate db:test:prepare - $ git add db/schema.rb --patch - $ git commit -m “yaknow” - $ git push - $ git stash
  11. $ git stash 29 - $ git stash - $

    git stash apply - $ git stash list - $ git stash apply stash@{1}
  12. Timing Migrations 30 - When adding a new column -

    First run the migration, then deploy the code - ActiveRecord needs to register that attribute from the database! - When dropping a column - First deploy the code, then run the drop migration - Ensure the code no longer relies on that column, then drop
  13. What is gh-ost? - Gh-ost is a tool developed by

    Github to change the schema of large MySQL tables without needing to take the table offline. - https://github.com/github/gh-ost (shoutout to Maciek!) - Install & Use - $ wget https://github.com/github/gh-ost/releases/latest - $ tar xzf gh-ost-binary-linux-20170914095800.tar.gz - $ ./gh-ost [flags] 32
  14. The Problem 33 - How long would a typical Rails

    migration take? - When can we take this table offline for the migration? - What happens to the data lost in the middle of that migration?
  15. How it works 36 - Example table: customers - Gh-ost

    makes a table customers_gho - Data is slowly copied over from customers - The binary log stream captures changes *asynchronously* - Gh-ost listens to a socket in the /tmp folder - It waits to cut over to the new table, if properly configured
  16. Gh-ost is Testable 37 - $ ./gh-ost [flags] - Some

    example flags: - --user=”XXX” - --password=”XXX” - --database=”database_name” - --alter=”ADD COLUMN first_name VARCHAR(255)” - --verbose - --execute (leave this off to test connection/migration) - --test-on-replica (option c)
  17. Gh-ost is Pausable 38 - More flags! ./gh-ost [...] --panic-flag-file=/tmp/ghost.panic.flag

    --postpone-cut-over-flag-file=/tmp/ghost.postpone.master.flag --throttle-flag-file=/tmp/migration.throttle --throttle-additional-flag-file=/tmp/ghost.throttle - $ touch /tmp/ghost.panic.flag - $ touch /tmp/ghost.postpone.master.flag - $ rm /tmp/ghost.postpone.flag - Two throttle flags to pause for running multiple migrations
  18. Gh-ost is Nice 42 - Nice Ratio! - This ratio

    is based upon n*x ms units of time sleeping, where n is the set ratio and x is the amount of time it takes to copy one row. - Assuming a single rows chunk copy takes 100ms to complete: - nice-ratio=0.5 will cause gh-ost to sleep for 50ms (0.5 * 100ms) immediately following. - nice-ratio=1 will cause gh-ost to sleep for 100ms (1 * 100ms) effectively doubling runtime
  19. Caveats 44 - Foreign Keys! - Github Issue #331 -

    Not a panacea for multi-billion row and 100 gig tables - RBR - Row Based Replication is a requirement Limits
  20. Caveats 46 Credentials - Reverse Lookup “password” on your machine

    - Using cnf file - $ touch /tmp/file.cfn - Remove --username=”XXX” and --password=”XXX” flags - Instead add --conf=/tmp/file.cfn
  21. Our Process 47 - Write your code (maybe merge?) -

    Test the migration without --execute flag - Coordinate all migrations with the Platform Infrastructure Team - Pin a message in #alerts Slack channel - $ ssh into production server - Start a new tmux session and make cutover postpone flag - $ tmux new -s dbmigration - $ touch /tmp/ghost.postpone.flag - Run it! (on the next slide) - Go back and delete *_gho table after a few days
  22. A Successful Migration 48 ./gh-ost \ --user="USER" \ --password="XXX" \

    --max-load=Threads_running=50 \ --critical-load=Threads_running=1000 \ --chunk-size=1000 \ --max-lag-millis=1500 \ --host="master.db.ibotta.com" \ --allow-on-master \ --assume-master-host="master.db.ibotta.com" \ --database="ibotta" \ --table="rule_items" \ --verbose \ --alter="ADD COLUMN eval_type VARCHAR(255) DEFAULT 'presentation'" \ --allow-master-master \ --assume-rbr \ --cut-over=default \ --exact-rowcount \ --concurrent-rowcount \ --default-retries=120 \ --panic-flag-file=/tmp/ghost.panic.flag \ --postpone-cut-over-flag-file=/tmp/ghost.postpone.master.flag \ --throttle-additional-flag-file=/tmpghost.throttle \ --initially-drop-ghost-table \ --initially-drop-old-table \ (you can leave this off to test, of course) --execute