$30 off During Our Annual Pro Sale. View Details »

Alembic and SQLAlchemy: sane schema management

Alembic and SQLAlchemy: sane schema management

A shorter intro to schema management using alembic for Code Fellows

Selena Deckelmann

February 27, 2015
Tweet

More Decks by Selena Deckelmann

Other Decks in Technology

Transcript

  1. Sane Schema Management with
    Alembic and SQLAlchemy
    Selena Deckelmann
    Mozilla
    @selenamarie
    http://www.whitecells.org

    View Slide

  2. AMA

    Please, ask questions.

    Suggestions: Postgres, DB operations, Python
    app development, Mozilla webdev, GSoC,
    Outreachy (formerly OPW), Open Source/Free
    Software, Vim vs Emacs, memes involving
    setting things on fire, release engineering,
    messy diagrams, power lifting

    Can I get a scribe volunteer?

    View Slide

  3. I work on Socorro.
    http://github.com/mozilla/socorro
    http://crash-stats.mozilla.com
    PyLadies PDX, Founder
    Python Software Foundation, Director
    Ada Initiative, Advisor
    PostgreSQL, Major Contributor

    View Slide

  4. What's alembic?
    Alembic is a schema migration tool
    that integrates with SQLAlchemy.

    View Slide

  5. Database systems resist change.

    View Slide

  6. Evolution of schema change process

    View Slide

  7. What's sane schema management?
    Executing schema change in a
    controlled, repeatable way while
    working with developers and
    operations.

    View Slide

  8. Migration tools are really
    configuration management tools.

    View Slide

  9. Migrations are for:

    Communicating change

    Communicating process

    Executing change in a controlled,
    repeatable way with developers
    and operations

    View Slide

  10. My environment:

    Schema migrations are frequent.

    Automated schema migration is a goal.

    Stage environment is enough like
    production for testing.

    Writing a small amount of code is ok.

    View Slide

  11. Part 0: #dbaproblems
    Part 1: Picking the right migration tool
    Part 2: Using Alembic
    Part 3: Lessons Learned
    Part 4: Things Alembic could learn

    View Slide

  12. Process before Alembic:

    View Slide

  13. View Slide

  14. View Slide

  15. Process with Alembic:
    1.Make changes to model.py or
    raw_sql files
    2.Run: alembic revision –-auto-generate
    3.Edit revision file
    4.Commit changes
    5.Run migration after auto-deploy of
    a release

    View Slide

  16. Process with Alembic:
    1.Make changes to model.py or
    raw_sql files
    2.Run: alembic revision -–auto-
    generate
    3.Edit revision file
    4.Commit changes
    5.Run migration after auto-deploy of
    a release
    5.Have jenkins and travis-ci run
    downgrade/upgrade as part of test
    suite.
    6.Run migration automatically.

    View Slide

  17. Problems Alembic solved:

    Easy-to-deploy changes

    Can embed raw SQL, and
    issue multi-commit changes

    Includes downgrades

    View Slide

  18. Problems Alembic solved (continued):

    Enables database change discipline

    Enables code review discipline

    Revisions are decoupled from release
    versions and branch commit order

    View Slide

  19. Problems Alembic solved (continued):

    100k+ lines of code removed

    No more post-deploy schema
    checkins

    Enabling a tested, automated stage
    deployment

    Separated schema definition from
    version-specific configuration

    View Slide

  20. Photo courtesy of secure.flickr.com/photos/lambj
    HAPPY
    AS A CAT IN A BOX

    View Slide

  21. Shameful admission:
    We had three different ways of
    defining schema in our code and tests.

    View Slide

  22. Why use an ORM?

    View Slide

  23. A good ORM provides:

    Schema defined in one place

    Reusable components

    Integration with useful tools

    Database version independence

    Ability to use raw SQL

    View Slide

  24. And good ORM stewardship:

    Fits with developer workflows

    Enables partnership with developers

    Integrates with a testing framework

    View Slide

  25. And:

    Gives you a new way to think about
    schemas

    Develops compassion for how horrible
    ORMs can be

    Gives you developer-friendly vocabulary
    for discussing why ORM-generated code
    is often terrible

    View Slide

  26. Part 2: Using Alembic

    View Slide

  27. Practical Guide to using Alembic
    http://tinyurl.com/po4mal6

    View Slide

  28. https://alembic.readthedocs.org
    Vocabulary:
    revision: a single migration
    down_revision: previous migration
    upgrade: apply 'upgrade' change
    downgrade: apply 'downgrade' change
    offline mode: emit raw SQL for a
    change

    View Slide

  29. Installing and using:
    $ pip install alembic
    $ alembic init
    $ vi alembic.ini
    $ alembic revision -m “new”
    $ alembic upgrade heads
    $ alembic downgrade -1

    View Slide

  30. Defining a schema?
    vi env.py
    Add: import myproj.model

    View Slide

  31. Helper functions?
    Put your helper functions in a custom
    library and add this to env.py:
    import myproj.migrations

    View Slide

  32. Ignore certain schemas or partitions?
    In env.py:
    def include_symbol(tablename, schema):
    return schema in (None, "bixie") and
    re.search(r'_\d{8}$', tablename)
    is None

    View Slide

  33. Manage User Defined Functions?
    Chose to use raw SQL files
    3 directories, 128 files:
    procs/ types/ views/
    def load_stored_proc(op, filelist):
    procs_dir = os.path.normpath(os.path.join(
    __file__,
    '../../',
    'external/postgresql/raw_sql/procs'
    ))
    for filename in filelist:
    sqlfile = os.path.join(sqlfile,filename)
    with open(myfile, 'r') as stored_proc:
    op.execute(stored_proc.read())

    View Slide

  34. Stamping database revision?
    from alembic.config import Config
    from alembic import command
    alembic_cfg =
    Config("/path/to/yourapp/alembic.ini")
    command.stamp(alembic_cfg, "heads")

    View Slide

  35. Offline mode!
    alembic upgrade c1ac31c8fea -–sql
    alembic downgrade c1ac31c8fea:491cdcf9f97c --sql

    View Slide

  36. Demo with Socorro?
    Flask?
    https://flask-
    migrate.readthedocs.org/en/latest/

    View Slide

  37. Part 3: Lessons Learned

    View Slide

  38. Always roll forward.
    1.Put migrations in a separate
    commit from schema changes.
    2.Revert commits for schema
    change, leave migration commit
    in-place for downgrade support.

    View Slide

  39. Store schema objects in the smallest,
    reasonable, composable unit.
    1.Use an ORM for core schema.
    2.Put types, UDFs and views in
    separate files.
    3.Consider storing the schema in a
    separate repo from the application.

    View Slide

  40. Write tests. Run them every time.
    1.Write a simple tool to create a
    new schema from scratch.
    2.Write a simple tool to generate
    fake data.
    3.Write tests for these tools.
    4.When anything fails, add a test.

    View Slide

  41. Part 4: What Alembic could learn

    View Slide

  42. 1.Understand partitions
    2.Never apply a DEFAULT to a new
    column
    3.Help us manage UDFs better
    4.INDEX CONCURRENTLY
    5.Prettier syntax for multi-commit
    sequences

    View Slide

  43. Other tools:
    Sqitch
    http://sqitch.org/
    Written by PostgreSQL contributor
    Erwin
    http://erwin.com/
    Commercial, popular with Oracle
    South
    http://south.aeracode.org/
    Django-specific, well-supported

    View Slide

  44. Alembic resources:
    bitbucket.org/zzzeek/alembic
    alembic.readthedocs.org
    groups.google.com/group/sqlalchemy-
    alembic

    View Slide

  45. Sane Schema Management with
    Alembic and SQLAlchemy
    Selena Deckelmann
    Mozilla
    @selenamarie
    http://www.whitecells.org
    https://speakerdeck.com/selenamarie/sane
    -schema-management-with-alembic

    View Slide