Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Vitess: Online Schema Migration Automation

Vitess: Online Schema Migration Automation

For many, running an online schema migration operation is still a manual job: from building the correct command, through identifying where the migration should run and which servers are to be affected, to auditing progress and completing the migration. Sharded environment poses an additional burden, as any logical migration must be applied multiple times, once for each shard.

What if you could just issue an ALTER TABLE ... statement, and have all that complexity automated away?

Vitess, an open source sharding framework for MySQL, is in a unique position to do just that. This session shows how Vitess's proxy/agent/topology architecture, together with gh-ost or pt-online-schema-change, are used to hide schema change complexity, and carefully schedule and apply schema migrations.

Shlomi Noach

October 21, 2020
Tweet

More Decks by Shlomi Noach

Other Decks in Programming

Transcript

  1. PerconaLive EU 2020
    Vitess: online
    schema migration
    automation
    Shlomi Noach

    View Slide

  2. Engineer at PlanetScale
    Author of orchestrator, gh-ost, freno
    and others
    Blog at openark.org
    github.com/shlomi-noach
    @ShlomiNoach
    About me

    View Slide

  3. • Founded in February 2018
    • Venture backed: a16z, SignalFire
    • ~40 employees distributed
    • Authors Vitess
    • Offers PlanetScaleDB for cloud
    offering and PlanetScaleDB BYOK:
    Bring Your Own Kubernetes
    PlanetScale

    View Slide

  4. Vitess
    • CNCF graduated project
    • Maintained by the community
    • PlanetScale is a major contributor
    A database clustering system for
    horizontal scaling of MySQL

    View Slide

  5. Vitess architecture
    Consider a simple replication topology

    View Slide

  6. Vitess architecture
    vttablet
    • A daemon/sidecar
    • Controls the mysqld process
    • Interacts with the mysqld server
    • Typically on same host as mysqld

    View Slide

  7. Vitess architecture

    View Slide

  8. Vitess architecture
    vtgate
    vtgate
    • A smart proxy
    • Speaks mysql protocol
    • Impersonates the
    database
    • Relays queries to
    tablets

    View Slide

  9. Vitess architecture
    vtgate
    • Stateless
    • Scale out

    View Slide

  10. Vitess architecture
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded

    View Slide

  11. Vitess architecture
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    USE commerce
    SELECT * FROM orders
    WHERE customer_id=4

    View Slide

  12. Vitess architecture
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    topo
    topo
    • Vitess state backend
    • etcd/zk/consul
    • Info:
    • Sharding schemes
    • Schemas
    • Clusters
    • Tablets, servers,
    • Operations
    • ...

    View Slide

  13. Vitess architecture
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    topo
    vtctld
    • Daemon
    • API server
    • Operations

    View Slide

  14. Vitess architecture
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    topo

    View Slide

  15. Mystery card
    ?
    A hidden truth

    View Slide

  16. • ALTER TABLE is blocking
    Issues with schema migrations

    View Slide

  17. Issues with schema migrations
    • ALTER TABLE is blocking
    • gh-ost, pt-online-schema-change,
    fb-osc as alternatives
    Installing, formalizing, command line
    options

    View Slide

  18. Issues with schema migrations
    • ALTER TABLE is blocking
    • gh-ost, pt-online-schema-change,
    fb-osc as alternatives
    • Discovery
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded

    View Slide

  19. Issues with schema migrations
    • ALTER TABLE is blocking
    • gh-ost, pt-online-schema-change,
    fb-osc as alternatives
    • Discovery
    • Running
    commerce
    shard 0
    commerce
    shard 1

    View Slide

  20. Issues with schema migrations
    • ALTER TABLE is blocking
    • gh-ost, pt-online-schema-change,
    fb-osc as alternatives
    • Discovery
    • Running
    • Throttling commerce
    shard 0
    commerce
    shard 1

    View Slide

  21. Issues with schema migrations
    • ALTER TABLE is blocking
    • gh-ost, pt-online-schema-change,
    fb-osc as alternatives
    • Discovery
    • Running
    • Throttling
    • Visibility
    commerce
    shard 0
    commerce
    shard 1

    View Slide

  22. Issues with schema migrations
    • ALTER TABLE is blocking
    • gh-ost, pt-online-schema-change,
    fb-osc as alternatives
    • Discovery
    • Running
    • Throttling
    • Visibility
    • Control
    commerce
    shard 0
    commerce
    shard 1

    View Slide

  23. Vitess’ architecture has the knowledge
    and the capability to address the above.
    Issues with schema migrations

    View Slide

  24. Vitess Online DDL
    ALTER WITH ‘gh-ost’ TABLE
    orders ADD COLUMN ts TIMESTAMP
    ZeroDependenciesNoReally™

    View Slide

  25. Vitess Online DDL
    ALTER WITH ‘pt-osc’ TABLE
    orders ADD COLUMN ts TIMESTAMP
    ZeroDependenciesNoReally™

    View Slide

  26. Vitess Online DDL
    DEMO

    View Slide

  27. Vitess Online DDL
    What just happened?

    View Slide

  28. Online DDL flow
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    topo
    vtgate
    • Intercepts
    • Parses
    • Persists

    View Slide

  29. Online DDL flow
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    topo
    topo
    • Stores

    View Slide

  30. Online DDL flow
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    topo
    vtctld
    • Observes
    • Discovers
    • Distributes
    • Validates

    View Slide

  31. Online DDL flow
    commerce
    shard 0
    commerce
    shard 1
    internal
    unsharded
    topo
    vttablet
    • Has all the fun

    View Slide

  32. vttablet on primary
    • Schedules
    • Prepares script(s)
    • Runs gh-ost or pt-osc
    • Tracks
    • Throttles
    • Cleans up
    • Tracks state in local _vt meta schema
    Online DDL flow

    View Slide

  33. Vitess Online DDL
    DEMO, continued

    View Slide

  34. Vitess Online DDL
    ZeroDependenciesNoReally™

    View Slide

  35. Q: Which MySQL account and
    privileges do I need for the migration
    user? How do I tell Vitess the
    credentials?
    A: Vitess creates a migration user with
    random password for each migration, on
    your behalf.
    You do not need to do anything.

    View Slide

  36. Q: How do I setup the gh-ost hooks?
    A: Vitess creates gh-ost hooks directory
    and scripts for you.
    Vitess creates a pt-osc plugin for you.
    You do not need to do anything.

    View Slide

  37. Q: Where do I install gh-ost?
    A: On linux amd64 and with recent glibc,
    Vitess comes with gh-ost binary
    pre-compiled and embedded, and will
    extract the binary on your behalf.
    You do not need to do anything.

    View Slide

  38. Q: Where do I install gh-ost? (Cont.)
    Otherwise use -gh-ost-path.
    pt-online-schema-change is not
    bundled. Install it on all MySQL servers.
    Use -pt-osc-path

    View Slide

  39. Q: How do I configure the throttling
    replicas?
    A: Each vttablet runs a throttler service,
    which automatically and dynamically
    detects the throttling replicas. It
    evaluates replication lag autonomically.
    Both gh-ost and pt-osc are configured
    to use it.
    You do not need to do anything (but you
    can configure -throttle_tablet_types="replica,rdonly"
    )

    View Slide

  40. Q: How do I drop pt-osc leftover
    triggers in case of failure?
    A: Vitess will identify a failed migration,
    even if vttablet itself fails, and will drop
    the triggers on your behalf.
    You do not need to do anything.

    View Slide

  41. Q: How do I drop the leftover tables?
    A: Each primary vttablet runs a table
    lifecycle process. It will collect the
    leftover tables (whether migration is
    successful or failed) and will safely and
    slowly (using the throttler mechanism)
    purge and drop them.
    You do not need to do anything.

    View Slide

  42. Mystery card reveal
    ?
    A hidden truth,
    In something not said

    View Slide

  43. Vitess as an
    infrastructure
    framework
    None of the above requires you to run
    traffic through Vitess.
    You may keep running your production
    traffic as normal, if you wish, and only
    use Vitess for infrastructure.

    View Slide

  44. • GitHub and GitHub team:
    • github.com/github/gh-ost
    • github.com/github/freno
    • Lessons learned
    • Percona
    • percona.com/doc/percona-toolkit
    • Community
    • Vitess maintainers
    • PlanetScale team
    Acknowledgements

    View Slide

  45. Resources
    • Docs: vitess.io/docs
    • Code: github.com/vitessio/vitess
    • Slack: vitess.slack.com
    • Demo:
    asciinema play https://git.io/JUhMa

    View Slide

  46. Questions?
    github.com/shlomi-noach
    @ShlomiNoach
    Thank you!

    View Slide