Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Strategies to Edit Production Data

Julie Qiu
January 19, 2018

Strategies to Edit Production Data

https://www.devopsdays.org/events/2018-new-york-city/program/julie-qiu/

At some point, we all find ourselves at a SQL prompt making edits to the production database. We know it’s a bad practice, and we always intend to put in place safer infrastructure before we need to do it again, but what does a better system actually look like?

This talk progresses through 5 strategies for teams using a Python stack to do SQL writes against a database, to achieve increasing safety and auditability:

1. Raw SQL queries
2. Local one-off scripts
3. Deploy and run scripts from an application server
4. Run scripts from Jenkins with command line arguments
5. Build a Script Runner application

We’ll talk about the pros and cons of each strategy, and help you determine which one is right for your specific needs.

By the end of this talk, you’ll be ready to start upgrading your infrastructure for making changes to your production database safely!

Julie Qiu

January 19, 2018
Tweet

More Decks by Julie Qiu

Other Decks in Technology

Transcript

  1. 1. Add record to a spreadsheet Process for manual edits

    Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  2. Manual Queries Spreadsheet Name Julie Qiu Date 1/19/2018 Description Marketing

    team request Query UPDATE products SET name=‘juliies-product’ WHERE id=1 Reviewer Name Codanda Appachu Reviewer Status Approved / Changes Requested Reviewer Comments You spelled Julie wrong
  3. 1. Add record to a spreadsheet 2. Reviewer approves the

    query Process for manual edits Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  4. 1. Add record to a spreadsheet 2. Reviewer approves the

    query 3. Run the query Process for manual edits Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  5. Run the query inside a transaction BEGIN; UPDATE products SET

    name=‘julies-product’ WHERE id=1; END; Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  6. • Easy to implement What’s great Local Scripts Existing Server

    Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  7. • Easy to implement • Audit trail What’s great Local

    Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  8. • Easy to implement • Audit trail • Promotes the

    right behaviors What’s great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  9. • Easy to make mistakes What’s not great Local Scripts

    Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  10. • Easy to make mistakes • Audit trail is at

    will What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  11. • Easy to make mistakes • Audit trail is at

    will • Difficult to run long and complex logic What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  12. 1. Write a script Run scripts locally Local Scripts Existing

    Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  13. Write the script with your ORM of choice UPDATE products

    SET name=‘product-1’ WHERE id=1 Script Runner Service Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  14. 1. Write a script 2. Connect to remote database Run

    scripts locally Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  15. 1. Write a script 2. Connect to remote database 3.

    Run the script Run scripts locally Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  16. python update_product_name.py --product_name=‘julies-product’ --product_id=1 --dry-run Run the script (but first

    do a dry-run!) Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  17. python update_product_name.py --product_name=‘julies-product’ --product_id=1 --dry-run Run the script (but do

    a dry-run!) Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  18. • Reusable What’s great Local Scripts Existing Server Task Runner

    Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  19. • Reusable • Easy to manipulate outputs What’s great Local

    Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  20. • Reusable • Easy to manipulate outputs • Access to

    common code What’s great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  21. • Easy to make mistakes What’s not great Local Scripts

    Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  22. • Easy to make mistakes • Logs are only available

    locally What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  23. • Easy to make mistakes • Logs are only available

    locally • Network disconnections What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  24. UPDATE products SET name=‘product-1’ WHERE id=1 UPDATE products SET name=‘product-2’

    WHERE id=2 UPDATE products SET name=‘product-3’ WHERE id=3 Doing this 50 million times UPDATE …
  25. 1. Write a script Run on an existing server Local

    Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  26. 1. Write a script 2. Get the script onto a

    server Run on an existing server Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  27. 1. Write a script 2. Get the script onto a

    server 3. SSH and run inside a session Run on an existing server Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  28. • Ability to run long scripts What’s great Local Scripts

    Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  29. • Ability to run long scripts • Reliable network connectivity

    What’s great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  30. • Ability to run long scripts • Reliable network connectivity

    • Infrastructure already exists What’s great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  31. • Scripts can affect resources on your server What’s not

    great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  32. • Scripts can affect resources on your server • Not

    user friendly What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  33. • Scripts can affect resources on your server • Not

    user friendly • No persistent audit trail What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  34. Task Runner Database Local Scripts Existing Server Task Runner Script

    Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  35. 1. Write a script Use a task runner Local Scripts

    Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  36. 1. Write a script 2. Code review and run tests

    Use a task runner Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  37. 1. Write a script 2. Code review and run tests

    3. Input arguments and run Use a task runner Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  38. • Persistent audit logs What’s great Local Scripts Existing Server

    Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  39. • Persistent audit logs • Code review and automated tests

    What’s great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  40. • Persistent audit logs • Code review and automated tests

    • User interface What’s great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  41. User Interface Audit Trail Code Review Long Scripts Complex Logic

    Data Editing Testing Benefits Raw SQL Existing Server Local Scripts Task Runner
  42. • Hard to manage credentials What’s not great Local Scripts

    Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  43. • Hard to manage credentials • Environments are not clearly

    separated What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  44. • Hard to manage credentials • Environments are not clearly

    separated • Inputs are not verified What’s not great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  45. User Interface Script Runner (dev) Database (dev) Script Runner (production)

    Database (production) Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  46. 1. Write a script Script runner service Local Scripts Existing

    Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  47. 1. Write a script 2. Code review and run tests

    Script runner service Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  48. 1. Write a script 2. Code review and run tests

    3. Choose environment and run Script runner service Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  49. • Centralized configuration management What’s great Local Scripts Existing Server

    Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  50. • Centralized configuration management • Separation of environments What’s great

    Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  51. • Centralized configuration management • Separation of environments • User

    friendly interface What’s great Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  52. • Parallelize and scale More things we can do Local

    Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  53. • Parallelize and scale • Preview results More things we

    can do Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  54. • Parallelize and scale • Preview results • … up

    to you to customize! More things we can do Local Scripts Existing Server Task Runner Script Runner Service Local Scripts Raw SQL Raw SQL Raw SQL
  55. Centralize Configs Separate Environ User Interface Audit Trail Code Review

    Long Scripts Complex Logic Data Editing Testing Benefits … Raw SQL Existing Server Local Scripts Script Runner Service Task Runner Task Runner