Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Migrating Shopify's Core Rails Monolith to Trilogy

Migrating Shopify's Core Rails Monolith to Trilogy

Trilogy is a client library for MySQL-compatible database servers. It was open sourced along with an Active Record adapter by GitHub this past year. With promises of improved performance, better portability and compatibility, and fewer dependencies, Shopify’s Rails Infrastructure team decided to migrate our core Rails monolith from Mysql2 to Trilogy.

In this talk, we’ll embark on a journey involving database clients, Active Record adapters, and open source contributions. We’ll learn a bit about MySQL protocols, dig into how the Active Record adapter manages the Trilogy client under the hood, and look at some of the missing features we implemented as we moved from Mysql2 to Trilogy. Finally, we’ll discuss the end result of migrating to Trilogy, and the impact it had on Shopify’s monolith in production.

Adrianna Chang

May 01, 2023
Tweet

Other Decks in Programming

Transcript

  1. Hi, i’m adrianna! Senior Software Engineer Based in Ottawa, ON

    󰎟 Working with Rails since 2016 ❤ Rails Triage Team Passionate about OSS and making Rails more diverse! 🏻 ♀ ☀
  2. Trilogy db client 1 An overview of Trilogy, and why

    you might want to use it. Building a feature in trilogy 2 Adding a new feature to Trilogy that we needed for our application. Migrating to trilogy 3 Changes we made to our app code to get CI passing with Trilogy. Deploying trilogy 4 How we deployed a large change like this to production, and what the results were. On the agenda today
  3. Trilogy db client • Custom implementation of MySQL protocol •

    Supports most frequently used parts of the protocol ◦ Authentication handshake ◦ Ping, query, and quit commands • Designed for flexibility, performance and ease of embedding Check out Matthew Draper’s article on GitHub’s engineering blog: https://github.com/github/trilogy • MySQL database client with Ruby bindings • Open sourced by GitHub in 2022
  4. Why switch? • Minimal dependencies required for compilation ◦ No

    dependency on libmysqlclient / libmariadb, which means: ▪ Simpler installation ▪ Can eliminate client / server version mismatch issues ▪ Minimizes times data must be copied when handling packets • Designed to perform efficiently in context of Ruby VM ◦ Conscious use of dynamic memory allocation ◦ API designed to use non-blocking operations and IO callbacks where possible
  5. CAVEATS • A few incompatibilities with MySQL 8.0 • Not

    as feature-complete as Mysql2 / libmysqlclient • Newly open-sourced library ◦ Looking for an open source project to contribute to? Consider helping us out! 🫶
  6. Trilogy + • Repo: github/activerecord-trilogy-adapter • Database config pointing to

    trilogy • Gems: Trilogy DB client, Active Record adapter
  7. Why shopify made the switch • Better developer experience •

    Speedups in query performance • Strong maintainership • Upstream Trilogy to rails/rails • Make Trilogy a community standard
  8. Mysql client / server protocol Client Server Connects Handshake packet

    Authentication packet OK or ERR • Server listens for connections on TCP/IP port or local socket • When a client connects, the client and server go through an authentication phase 1) Server checks whether host is allowed to connect 2) If allowed, server sends handshake packet 3) Client sends back authentication packet 4) Server verifies information, responds with OK (success) or ERR
  9. Mysql client / server protocol Command packet Result Client Server

    • If the authentication phase is successful, the client can enter query phase and issue commands • Many type of commands: ◦ COM_PING ◦ COM_QUERY ◦ COM_QUIT Examples of other types of commands available in the query phase: • Server responds with result: ◦ OK, EOF, ERR or result set
  10. Protocol capabilities • During the authentication phase, the client and

    server exchange info about what they’re able to do via a bitmask of protocol capabilities Client Server Connects Handshake packet Authentication packet OK or ERR Interested in learning more? Check out the MySQL docs on capabilities flags: • Examples: ◦ CLIENT_CONNECT_WITH_DB ◦ CLIENT_SSL ◦ CLIENT_MULTI_STATEMENT
  11. ”MySQL supports the execution of a string containing multiple statements

    separated by semicolon (;) characters. This capability is enabled by special options specified either when you connect to the server [...] or after connecting by calling set_server_option().“ MySQL docs on multi-statement execution support: Multi-statement support
  12. Multi-statement support ”MySQL supports the execution of a string containing

    multiple statements separated by semicolon (;) characters. This capability is enabled by special options specified either when you connect to the server [...] or after connecting by calling set_server_option().“ Multi-statement support
  13. Protocol Capabilities ”During the authentication phase, the client and server

    exchange info about what they’re able to do.“ Client Server Connects Handshake packet Authentication packet OK or ERR Flag: CLIENT_MULTI_STATEMENTS Hex Value: 0x10000 When set, indicates that the client can send more than one statement in one query. Multi-statement support
  14. Multi-statement support • Fixture insertion! ◦ Reducing queries required to

    insert fixtures • Multi-statement queries can be more performant ◦ Shopify: ~1000 fixture sets, some with 100s of fixtures Multi-statement support Why does this matter for Rails?
  15. • #multi_statements_enabled? ◦ Are multi-statements enabled? A look at the

    Mysql2 adapter in Rails… • #with_multi_statements ◦ Runs block in multi-statement context Multi-statement support
  16. • #execute_batch ◦ Combines statements into smaller number of SQL

    strings • #abandon_results! ◦ Get connection back to usable state A look at the Mysql2 adapter in Rails… ◦ Adds SQL chunks up up to @@max_allowed_packet Multi-statement support
  17. MySQL docs on multi-statement execution support: ”MySQL supports the execution

    of a string containing multiple statements separated by semicolon (;) characters. This capability is enabled by special options specified either when you connect to the server [...] or after connecting by calling set_server_option().“ Multi-statement support
  18. Reminder: C Extension Ruby Trilogy C API • Commands in

    C API split into send_ and recv_ functions • trilogy_set_option_send: builds set option packet and writes to socket Multi-statement support
  19. C Extension Ruby Trilogy C API Builds a COM_SET_OPTION packet

    with the server option to set COM_SET_OPTION packet Result Client Server Multi-statement support
  20. Multi-statement support • Commands in C API split into send_

    and recv_ functions • trilogy_set_option_recv: reads response packet from server C Extension Ruby Trilogy C API
  21. Calls #trilogy_set_option_send with the specified option • Expose #set_server_option in

    Ruby Waits for response from server via #trilogy_set_option_recv Returns true on success, handles timeouts / errors Multi-statement support C Extension Ruby Trilogy C API • In our C extension…
  22. Multi-statement support • #multi_statements_enabled? ◦ Are multi-statements enabled? Now with

    the Trilogy adapter… • #with_multi_statements ◦ Runs block in multi-statement context
  23. Original PR for multi-statement support in Trilogy: Support for setting

    server options in Trilogy: Adding multi-statement capabilities to Trilogy Active Record adapter: Multi-statement support
  24. Goal: have a green ci build with trilogy Step 1:

    Set application up to use Trilogy
  25. Step 2: Address small API changes in client Goal: have

    a green ci build with trilogy Mysql2::Result Trilogy::Result
  26. Goal: have a green ci build with trilogy Step 2:

    Address small API changes in client
  27. Step 3: Handle error class changes Mysql2::Timeout Error Trilogy::Timeout Error

    Mysql2::Connection Error Trilogy::Connection Error Goal: have a green ci build with trilogy
  28. • Core Rails application is a modular monolith Shopify Core

    • BFCM 2022: 14 million queries per second at peak • Database infrastructure ◦ Database is horizontally partitioned ◦ MySQL instances run on Google Cloud Platform VMs managed by Chef ◦ ProxySQL ▪ ~100 000 client connections to ~2000 backend connections
  29. Step 1: Attempt to run Trilogy DB client on 1%

    of production • Export ENV var on 1% of production pods • Switch out adapter in config/database.yml based on ENV var
  30. Step 1: Attempt to run Trilogy DB client on 1%

    of production • Export ENV var on 1% of production pods • Switch out adapter in config/database.yml based on ENV var • Ensure application code supported both Mysql2 and Trilogy
  31. Step 1: Attempt to run Trilogy DB client on 1%

    of production • Separate CI pipeline for Trilogy ◦ Too expensive! • Remove Mysql2 test coverage ◦ Not ideal, 99% of prod using Mysql2 • Half of CI on Trilogy, half with Mysql2 ◦ Difficult to spot bugs Decided to keep test coverage on Mysql2, and run CI against Trilogy daily How to run tests?
  32. Step 2: Try running Trilogy on 50% of production What

    happened?! • If client capabilities are different, ProxySQL needs to perform COM_CHANGE_USER • ProxySQL keeps pool of backend connections Shopify Core Mysql2 Trilogy COM_CHANGE_USER • Using clients with different protocol options overwhelmed DB with change user commands
  33. Step 2: Try running Trilogy on half of production •

    Queried a ProxySQL pod’s connection list • Mysql2 was setting CLIENT_MULTI_RESULTS, but Trilogy was not! Flag: CLIENT_MULTI_RESULTS Hex Value: 0x20000 When set, tells the server it's capable of handling multiple result sets from a query.
  34. Step 3: Attempt to run Trilogy subset of production… again

    • Targeted 100% of single cluster instead of 50% of production • Confirmed that the connection options matched between Trilogy and Mysql2 • Collected some metrics Request time: • Mysql2: Avg 3.46 ms • Trilogy: Avg 2.70ms ~22% faster
  35. MySQL query time: • Mysql2: Avg 1.49ms • Trilogy: Avg

    1.24 ms ~ 17% faster Step 3: Attempt to run Trilogy subset of production… again • Targeted 100% of single cluster instead of 50% of production • Confirmed that the connection options matched between Trilogy and Mysql2 • Collected some metrics
  36. Step 4: Run Trilogy on 100% of production • No

    issues running Trilogy at 100% on one of our European clusters • Took the plunge and migrated fully to Trilogy • Changed tests to run against Trilogy instead of Mysql2 • Success! 🎉
  37. What’s next? • Give Trilogy a try in a new

    application! • Try migrating an existing application from Mysql2 to Trilogy. • Consider contributing to the Trilogy client, or the adapter (now in Rails!) • Flex your new MySQL knowledge on all your coworkers 😂
  38. Thanks! ❤ adriannakchang adriannachang.me Presentation template by Slidesgo, icons by

    Flaticon. @paarthmadan @eileencodes @casperisfine @matthewd @composerinteralia @jhawthorn @HParker @brianmario @tenderlove