Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Zero-downtime upgrades with SQLAlchemy + Alembic

Zero-downtime upgrades with SQLAlchemy + Alembic

Alembic is a database migrations tool written by the author of SQLAlchemy. Alembic is easy to get started with an provides a number of sensible defaults out-of-the-box. However, like SQLAlchemy itself, when you look under the hood you find a seriously capable engine capable of powering all sorts of database schema migrations. It can even auto-generate the migration files themselves!

In this presentation, we seek to explore some of these advanced features by implementing a pattern - the expand-contract pattern - that can allow for near zero-downtime database upgrades.

Stephen Finucane

November 13, 2022
Tweet

More Decks by Stephen Finucane

Other Decks in Technology

Transcript

  1. Zero-downtime
    upgrades with
    Alembic
    A look at some of the power
    features of Alembic, the
    database migration tool
    PyCon IE 2022
    @stephenfin

    View Slide

  2. Stephen Finucane (@stephenfin)
    Senior Software Engineer
    Red Hat

    View Slide

  3. View Slide

  4. Alembic
    +

    View Slide

  5. Background

    View Slide

  6. class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    first_name = Column(String(30))
    last_name = Column(String(30))
    addresses = relationship(
    "Address", back_populates="user", cascade="all, delete-orphan"
    )
    def __repr__(self):
    return f"User(id={self.id!r})"
    models.py

    View Slide

  7. ❯ alembic revision --autogenerate ...

    View Slide

  8. # revision identifiers, used by Alembic.
    revision = '6cb93d555e2b'
    down_revision = None
    branch_labels = None
    depends_on = None
    def upgrade() -> None:
    op.create_table(
    'user_account',
    sa.Column('id', sa.Integer(), nullable=False),
    sa.Column('first_name', sa.String(length=30), nullable=True),
    sa.Column('last_name', sa.String(length=30), nullable=True),
    sa.PrimaryKeyConstraint('id'),
    )
    6cb93d555e2b_initial_migration.py

    View Slide

  9. ❯ alembic upgrade head

    View Slide

  10. .
    ├── alembic_expand_contract_demo
    │ ├── __init__.py
    │ ├── migrations
    │ │ ├── env.py
    │ │ ├── README
    │ │ ├── script.py.mako
    │ │ └── versions
    │ │ └── 6cb93d555e2b_initial_migration.py
    │ ├── models.py
    ├── alembic.ini
    ├── README.md
    └── requirements.txt

    View Slide

  11. .
    ├── alembic_expand_contract_demo
    │ ├── __init__.py
    │ ├── migrations
    │ │ ├── env.py
    │ │ ├── README
    │ │ ├── script.py.mako
    │ │ └── versions
    │ │ └── 6cb93d555e2b_initial_migration.py
    │ ├── models.py
    ├── alembic.ini
    ├── README.md
    └── requirements.txt

    View Slide

  12. The Problem

    View Slide

  13. class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    first_name = Column(String(30))
    last_name = Column(String(30))
    addresses = relationship(
    "Address", back_populates="user", cascade="all, delete-orphan"
    )
    def __repr__(self):
    return f"User(id={self.id!r})"
    models.py

    View Slide

  14. class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    addresses = relationship(
    "Address", back_populates="user", cascade="all, delete-orphan"
    )
    def __repr__(self):
    return f"User(id={self.id!r})"
    models.py

    View Slide

  15. ❯ alembic revision --autogenerate ...

    View Slide

  16. def upgrade() -> None:
    # schema migration - add new columns
    op.add_column(
    'user_account', sa.Column('name', sa.String(), nullable=True),
    )
    # ...some (manually added) data migrations...
    # schema migration - drop old columns
    op.drop_column('user_account', 'first_name')
    op.drop_column('user_account', 'last_name')
    0585005489f0_merge_user_names.py

    View Slide

  17. To do the schema and data migrations in one go,
    we’d need to shut down our instances…

    View Slide

  18. To do the schema and data migrations in one go,
    we’d need to shut down our instances…
    How do we avoid downtime? 🤔

    View Slide

  19. A Solution

    View Slide

  20. Expand - Contract

    View Slide

  21. Expand - Migrate - Contract

    View Slide

  22. first_name
    last_name
    first_name
    last_name
    name name
    expand migrate contract

    View Slide

  23. Let’s split up those migrations
    (manually, for now)

    View Slide

  24. revision = '9c36df1b3f62'
    down_revision = '6cb93d555e2b'
    branch_labels = None
    depends_on = None
    def upgrade() -> None:
    # schema migration - add new columns
    op.add_column(
    'user_account', sa.Column('name', sa.String(), nullable=True),
    )
    9c36df1b3f62_merge_user_names_expand.py

    View Slide

  25. revision = 'e629a4ea0677'
    down_revision = '9c36df1b3f62'
    branch_labels = None
    depends_on = None
    def upgrade() -> None:
    # schema migrations - contract
    op.drop_column('user_account', 'first_name')
    op.drop_column('user_account', 'last_name')
    9c36df1b3f62_merge_user_names_contract.py

    View Slide

  26. ❯ alembic upgrade 9c36df1b3f62 # expand hash
    ❯ alembic upgrade e629a4ea0677 # contract hash

    View Slide

  27. That’s kind of ugly, no? 🤮

    View Slide

  28. A Better Solution

    View Slide

  29. “A branch describes a point in a migration stream when two or
    more versions refer to the same parent migration as their
    ancestor.”
    Alembic Documentation

    View Slide

  30. revision = '9c36df1b3f62'
    down_revision = '6cb93d555e2b'
    branch_labels = None
    depends_on = None
    def upgrade() -> None:
    # schema migration - add new columns
    op.add_column(
    'user_account', sa.Column('name', sa.String(), nullable=True),
    )
    9c36df1b3f62_merge_user_names_expand.py (before)

    View Slide

  31. revision = '9c36df1b3f62'
    down_revision = '6cb93d555e2b'
    branch_labels = ('expand',)
    depends_on = None
    def upgrade() -> None:
    # schema migration - add new columns
    op.add_column(
    'user_account', sa.Column('name', sa.String(), nullable=True),
    )
    9c36df1b3f62_merge_user_names_expand.py (after)

    View Slide

  32. revision = 'e629a4ea0677'
    down_revision = '9c36df1b3f62'
    branch_labels = None
    depends_on = None
    def upgrade() -> None:
    # schema migrations - contract
    op.drop_column('user_account', 'first_name')
    op.drop_column('user_account', 'last_name')
    9c36df1b3f62_merge_user_names_contract.py (before)

    View Slide

  33. revision = 'e629a4ea0677'
    down_revision = '9c36df1b3f62'
    branch_labels = ('contract',)
    depends_on = None
    def upgrade() -> None:
    # schema migrations - contract
    op.drop_column('user_account', 'first_name')
    op.drop_column('user_account', 'last_name')
    9c36df1b3f62_merge_user_names_contract.py (after)

    View Slide

  34. ❯ alembic upgrade expand
    ❯ alembic upgrade contract

    View Slide

  35. Much better!
    .

    View Slide

  36. Much better!
    But we can do more…

    View Slide

  37. The “Best” Solution

    View Slide

  38. ❯ alembic revision --autogenerate ...

    View Slide

  39. def run_migrations_online() -> None:
    connectable = engine_from_config(
    config.get_section(config.config_ini_section),
    prefix="sqlalchemy.",
    poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
    context.configure(
    connection=connection,
    target_metadata=target_metadata,
    )
    with context.begin_transaction():
    context.run_migrations()
    env.py

    View Slide

  40. def run_migrations_online() -> None:
    connectable = engine_from_config(
    config.get_section(config.config_ini_section),
    prefix="sqlalchemy.",
    poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
    context.configure(
    connection=connection,
    target_metadata=target_metadata,
    process_revision_directives=autogen.process_revision_directives,
    )
    with context.begin_transaction():
    context.run_migrations()
    env.py

    View Slide

  41. process_revision_directives -
    a callable function that will be passed a structure representing
    the end result of an autogenerate or plain “revision” operation,
    which can be manipulated to affect how the alembic revision
    command ultimately outputs new revision scripts.
    Alembic Documentation

    View Slide

  42. def run_migrations_online() -> None:
    connectable = engine_from_config(
    config.get_section(config.config_ini_section),
    prefix="sqlalchemy.",
    poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
    context.configure(
    connection=connection,
    target_metadata=target_metadata,
    process_revision_directives=autogen.process_revision_directives,
    )
    with context.begin_transaction():
    context.run_migrations()
    env.py

    View Slide

  43. _ec_dispatcher = Dispatcher()
    def process_revision_directives(context, revision, directives):
    directives[:] = list(_assign_directives(context, directives))
    def _assign_directives(context, directives, phase=None):
    for directive in directives:
    decider = _ec_dispatcher.dispatch(directive)
    if phase is None:
    phases = ('expand', 'contract')
    else:
    phases = (phase,)
    for phase in phases:
    decided = decider(context, directive, phase)
    if decided:
    yield decided
    autogen.py (part 1)

    View Slide

  44. @_ec_dispatcher.dispatch_for(ops.MigrationScript)
    def _migration_script_ops(context, directive, phase):
    op = ops.MigrationScript(
    new_rev_id(),
    ops.UpgradeOps(
    ops=list(
    _assign_directives(context, directive.upgrade_ops.ops, phase)
    )
    ),
    ops.DowngradeOps(ops=[]),
    message=directive.message,
    head=f'{phase}@head',
    )
    if not op.upgrade_ops.is_empty():
    return op
    autogen.py (part 2)

    View Slide

  45. @_ec_dispatcher.dispatch_for(ops.AddConstraintOp)
    @_ec_dispatcher.dispatch_for(ops.CreateIndexOp)
    @_ec_dispatcher.dispatch_for(ops.CreateTableOp)
    @_ec_dispatcher.dispatch_for(ops.AddColumnOp)
    def _expands(context, directive, phase):
    if phase == 'expand':
    return directive
    else:
    return None
    autogen.py (part 3)

    View Slide

  46. @_ec_dispatcher.dispatch_for(ops.DropConstraintOp)
    @_ec_dispatcher.dispatch_for(ops.DropIndexOp)
    @_ec_dispatcher.dispatch_for(ops.DropTableOp)
    @_ec_dispatcher.dispatch_for(ops.DropColumnOp)
    def _contracts(context, directive, phase):
    if phase == 'contract':
    return directive
    else:
    return None
    autogen.py (part 4)

    View Slide

  47. @_ec_dispatcher.dispatch_for(ops.AlterColumnOp)
    def _alter_column(context, directive, phase):
    is_expand = phase == 'expand'
    if is_expand and directive.modify_nullable is True:
    return directive
    elif not is_expand and directive.modify_nullable is False:
    return directive
    else:
    raise NotImplementedError(
    "Don't know if operation is an expand or contract at the moment: "
    "%s" % directive
    )
    autogen.py (part 5)

    View Slide

  48. @_ec_dispatcher.dispatch_for(ops.ModifyTableOps)
    def _modify_table_ops(context, directive, phase):
    op = ops.ModifyTableOps(
    directive.table_name,
    ops=list(_assign_directives(context, directive.ops, phase)),
    schema=directive.schema,
    )
    if not op.is_empty():
    return op
    autogen.py (part 6)

    View Slide

  49. ❯ alembic revision --autogenerate ...

    View Slide

  50. # revision identifiers, used by Alembic.
    revision = '1630422c66fe'
    down_revision = '9c36df1b3f62'
    branch_labels = None
    depends_on = None
    def upgrade() -> None:
    op.add_column(
    'user_account', sa.Column('name', sa.String(), nullable=True)
    )
    1630422c66fe_merge_user_names.py

    View Slide

  51. # revision identifiers, used by Alembic.
    revision = 'a142d030afc5'
    down_revision = 'e629a4ea0677'
    branch_labels = None
    depends_on = None
    def upgrade() -> None:
    op.drop_column('user_account', 'last_name')
    op.drop_column('user_account', 'first_name')
    a142d030afc5_merge_user_names.py

    View Slide

  52. ❯ alembic upgrade expand
    ❯ alembic upgrade contract

    View Slide

  53. Next Steps & Wrap Up

    View Slide

  54. We can improve this further…

    View Slide

  55. We can improve this further…
    Distinguish expand/contract migrations by filename

    View Slide

  56. ❯ ls migrations/version
    1630422c66fe_merge_user_names.py
    6cb93d555e2b_initial_models.py
    9c36df1b3f62_initial_expand.py
    a142d030afc5_merge_user_names.py
    e629a4ea0677_initial_contract.py

    View Slide

  57. ❯ ls migrations/version
    1630422c66fe_merge_user_names.py
    6cb93d555e2b_initial_models.py
    9c36df1b3f62_initial_expand.py
    a142d030afc5_merge_user_names.py
    e629a4ea0677_initial_contract.py

    View Slide

  58. We can improve this further…
    Distinguish expand/contract migrations by filename
    Add checkpoints (e.g. version 1.0, 1.1, …)

    View Slide

  59. ❯ alembic upgrade expand
    ❯ alembic upgrade contract

    View Slide

  60. ❯ neutron-db-manage --expand zed
    ❯ neutron-db-manage --contract zed

    View Slide

  61. We can improve this further…
    Distinguish expand/contract migrations by filename
    Add checkpoints (e.g. version 1.0, 1.1, …)
    Add tests

    View Slide

  62. that.guru/blog/zero-downtime-upgrades-with-alembic-and-sqlalchemy/

    View Slide

  63. Credits
    Cover photo by Yunus Tuğ on Unsplash
    Source code based OpenStack Neutron code, licensed under Apache 2.0

    View Slide