Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Managing and Visualizing your Replication Topologies with Orchestrator

Shlomi Noach
September 26, 2015

Managing and Visualizing your Replication Topologies with Orchestrator

Introducing Orchestrator: a MySQL replication topology management service, that greatly simplifies DBA's tasks and enhances visibility on your topologies.

Shlomi Noach

September 26, 2015
Tweet

More Decks by Shlomi Noach

Other Decks in Technology

Transcript

  1. Agenda: • What? Why? • The killer demo • Breakdown:

    discovery, refactoring, recovery, interface • Pseudo GTID, GTID, binlog servers • Architecture & stack • Deployment at scale @ Booking.com • The CLI demo • Supported/unsupported • Contributing Managing and Visualizing … yada yada ...Orchestrator
  2. Not a sales pitch • orchestrator is free and open

    source • Designed to be as generic as possible • Some company specific rules or processes externalized via configuration https://github.com/outbrain/orchestrator 3
  3. What? Why? • With so many replication topologies; with many

    servers per topology, spanning multiple data centers; with the periodic server failures and movements, - Do you know how your topologies look like? - Does management know? • With the complexity of moving slaves around the topology; the rules allowing/disallowing server X to replication from Y; with the implications of cross-DC traffic on slave latency, - Who in your company can refactor your topologies other than yourself? • In the event of server failure, master or intermediate master breakage, - Do you have a clear visual into what fails? - What kind of solutions do you use? - Who can execute failover / override a failover / understand what’s going on? 4
  4. The killer demo or Let’s break our production servers right

    now and deal with the consequences once this conference is over
  5. Orchestrator breakdown: Discovery • Crawls through your topologies • Automatically

    recognizes new servers • Resolves IPs, CNAMEs • Revisits your servers periodically • Collects data (version, binlogs, replication, …) 7
  6. Orchestrator breakdown: Refactoring • Understands: • binlog file:pos • Pseudo-GTID

    • GTID (Oracle + MariaDB) • binlog servers • Knows the rules for replication X from Y • Will refactor your topology for you: safely redesign your topology • Fine grained control or “just do it for me, I’m too tired to think” • Can refactor via slick web UI • Or via nerdy command line interface 11
  7. Orchestrator breakdown: Recovery • Keeps a state of your topology

    • Uses holistic approach to detect failures http://code.openark.org/blog/mysql/what-makes-a-mysql-server-failurerecovery-case • If replication breaks, orchestrator knows what the expected topology looked like • And can recommend “the next best option”, based on state, not on configuration • And, if you like, can execute an automated/manual failover that heals your topology and leaves no slave (or only those utterly incapable of restoring) behind 12
  8. Orchestrator breakdown: Interface • Command line • Well formed output

    • Go-to if you like your --debug logs • Web API • Simple GET (not REST) • Web UI • Uses Web API • Designed to be friendly 13
  9. Binary logs coordinates: recap • Replication based on file:pos •

    Different file names on masters & slaves • Different positions on masters & slaves • Once the connection is broken, difficult to match up again 14
  10. GTID • Every transaction has a unique identifier • When

    a slave connects to a master, it looks for the last GTID statement it already executed • Available in Oracle MySQL 5.6, MariaDB 10.0 • Completely different implementations; may cause lockup • 5.6 migration path is painful (alleviated in 5.7) • 5.6 requires binary logs & log-slave-updates enabled on all slaves (alleviated in 5.7) • 5.6 errant transactions, unexecuted sequences • GTID will be the requirement in future Oracle features • MariaDB GTID supports domains; easy to use 18
  11. Pseudo GTID • Pseudo GTID offers what GTID offers, without

    GTID. This includes: • Slave repointing • Failover schemes • With less requirements • Bulk operations • Without upgrading your servers; without installing anything on them; in short: not touching your beloved existing setup • No vendor lockdown; no migration paths 19
  12. Pseudo GTID • Application-side enhancement • We inject a uniquely

    identified statement every X seconds. We call it Pseudo GTID. • Pseudo GTID statements are searchable and identifiable in binary and relay logs • Make for “markers” in the binary/relay logs • Injection can be made via MySQL event scheduler or externally • Otherwise non intrusive. No changes to topology/versions/methodologies 20
  13. Injecting Pseudo GTID create event if not exists create_pseudo_gtid_event on

    schedule every 5 second starts current_timestamp on completion preserve enable do begin set @pseudo_gtid_hint := uuid(); set @_create_statement := concat('drop ', 'view if exists `meta`.`_pseudo_gtid_hint__', @pseudo_gtid_hint, '`'); PREPARE st FROM @_create_statement; EXECUTE st; DEALLOCATE PREPARE st; end $$ 21
  14. In the binary logs mysql> show binlog events in 'mysql-bin.015631'

    \G ... Log_name: mysql-bin.015631 Pos: 1632 Event_type: Query Server_id: 1 End_log_pos: 1799 Info: use `meta`; drop view if exists `meta`.`_pseudo_gtid_hint__50731a22-9ca4- 11e4-aec4-e25ec4bd144f` ... 22
  15. MySQL binary & relay logs: Pseudo GTID injection 25 Master

    Slave insert > PGTID 17 update delete create > PGTID 82 delete delete > PGTID 56 insert insert update drop update insert > PGTID 17 update delete create > PGTID 82 delete delete > PGTID 56 insert insert update drop insert > PGTID 17 update delete create > PGTID 82 delete delete > PGTID 56 insert insert
  16. insert > PGTID 17 update delete create > PGTID 82

    delete delete > PGTID 56 insert insert Pseudo GTID: repoint, based on binary logs 26 Master Slave insert > PGTID 17 update delete create > PGTID 82 delete delete > PGTID 56 insert insert update drop update
  17. insert > PGTID 17 update delete create > PGTID 82

    delete delete > PGTID 56 insert insert update drop Pseudo GTID: repoint, based on relay logs 27 Master Slave insert > PGTID 17 update delete create > PGTID 82 delete delete > PGTID 56 insert insert update drop update
  18. • A MySQL-server-like entity • Which merely relays the master’s

    binary logs • Under same name and position • Nested binlog servers allow for simplified refactoring and offer a simplified & faster master recovery mechanism • See Binlog Servers @ Booking.com https://www.percona.com/live/europe-amsterdam-2015/sessions/binlog-servers-bookingcom • Orchestrator supports: • hybrid standard + binlog-server replication topologies • Pure binlog server topologies Binlog Servers 29
  19. • Can execute as long-running service • Provides HTTP UI,

    Web API • Polls servers, checks for crashes, recovers, periodic operations • Leader election • Can run as command line • Issue a single command & exit • Requires (same, single) MySQL backend for any operation. • Backend database has the state of topologies • orchestrator itself mostly stateless (pending operation excluded, optimistic mode) • Agent-less for most operations; communicates directly with MySQL instances Orchestrator architecture 30
  20. • HA: orchestrator leader election • Self healing backend data

    • All locks auto expiring • Support authentication (basic-auth, reverse proxy) • Operations friendly, e.g.: • Server maintenance flag • Downtiming servers • Marking as “best candidate” Orchestrator architecture 31
  21. • Stack: • golang - in retrospect a very good

    choice: a lot of concurrency; easy deployment; rapid development • MySQL as backend database (duh) • go-martini web framework • Page generation via dirty JavaScript/jQuery (sue me) • Twitter bootstrap • Graphs via D3 • Development: • Github, open source; accepting pull-requests https://github.com/outbrain/orchestrator/ Orchestrator stack & development 33
  22. MySQL @ Booking.com 2015 • We are a big MySQL

    shop • We have ALOT production servers on ALOT topologies (aka chains, aka clusters) • As small as 1 server per topology, as large as hundreds of servers per topology • Two major data centers, now populating our third • Single master, plenty slaves • All chains are deployed with Pseudo-GTID and controlled by orchestrator • Larger chains: hybrid, normal + binlog servers topologies (complex!) • “Pure” binlog-server topologies experimental, non-production • Some topologies sharded • A little bit of active/passive master-master 34
  23. Orchestrator architecture @ Booking.com 35 app leader app app app

    HTTP load balancer orchestrator-cli on all MySQL nodes
  24. Orchestrator @ Booking.com 2015 • 5-6 hosts running orchestrator service,

    one is elected as leader at any given time • ALOT hosts with orchestrator as CLI • Single elected service polls all our instances • Each MySQL instance polled every 30s • Pseudo-GTID deployed on all chains • Orchestrator configured to auto-recover the death of any intermediate master • Orchestrator configured to auto recover from some master failures • Both the above happen • Some checks & dashboards rely on orchestrator data (API / DB) • Some operations rely on orchestrator logic 36
  25. CLI demo Where real stuff gets done including pie charts

    ..,.. .M +.. .....N$7$$O$77MZN?.. .~8?: M+8.. ...~7$7777$77$$$$$$$$$7D......=I+?M. .M?+=MMM$$Z$Z$$7$$$7$$7$77$$$$7$$$MZI+=I= . .~III7$$$$$$7777$$$$777777777$$$$$7$7?IM . .M7Z$$777777777777777777777$$$ZD+Z?M. MZ$$77777=7777777777$$Z7777$Z8OMMN~. .M,M8$7777O.MI7$$7IONMMMMMZ777$N.8DM,NM8.,.... ... M+D.MZ7777Z~M88OOODNZZZZZ8NM$77NM=..MD7$Z7$ZM:..,, ..~:DO777IMZZ$777$7777777$$ZZOMODNNNZ$7ZZ$ZZZZ$ZZZZZ8ONMM.. ..MMNZ$77M8Z$77777777777777777$ZM$7I7I$ZZ$ZZOO$OZO$ZZZZZZZ8M... M777778MZZ777777777777777777777$ZN$777$Z$Z$Z$ZOZOOZOZZZZ$7$$Z$ZMDI$7ZM.. I777$NZZ777777777777777777777777$ZOD77$Z$$ZZZZOZ$$$O$$$ZZZZZZZOZZOOZOZZOZ... M7$7MO$777777777777777777777777777O8M77Z$ZZZZZ$ZZ$Z$$$$ZZ$ZZZ$$$ZZZZZ$ZZZZ8M. OI$77$877777777777777777777777777777$ZZI$$$ZZ$ZZZZZZZZ$$ZZ$Z$$$ZZZZ$$ZZZZ$$O$M.. .,$$$I777777777777777777777777777777777$Z7$$$$ZZZZZ$Z$ZZZZ$$ZZ$ZZZ$ZZZ$$$ZZZZZZZI8. M777777777777777777777777777777777777777$$$7$ZZZZ$ZZZZZZZZZZZZ7$ZZZZZZZ$ZZZZZ$$$ZO,. .NZ7I77777777777777777777777777777777777777$$ZZZZZ$$ZZZZ7ZZZZZ$Z$$$$$Z$$$Z$Z7$Z$OODMM ~$$7II77777777777777777777777777777777777777$Z$$ZZ$$ZZ$Z$$$$Z$ZZZZ7$$Z$Z$$$Z$ZZ777$.. ,ZZ$7III7777777777777777777777777777777I77777Z$$$$$$$$$$$Z$$$Z$77$ZZ$Z$$$$$Z$ZZ7$$77M,.. .NZMZ$III777777777777777777777777777777III7777Z$$Z7$$$$77$$$$Z$$Z$7$$$$$$$Z77$$$$7$$$$M:... ,:.MZ$IIII7777777777777777777777777777IIII7777ZZ$$$$$$$777$Z77Z7Z$$$$Z$Z$$$$Z$$$$$7$$$7,:. .NOI7III777777777777777777777777III7IIIII777$$$$$$$$$$Z$$$$$$$7$$$$$$$7$$7$$$$$$$$$$I8. OZOI77II77777777777777777777777IIIIIIII7777Z$$7$7Z$$$7$7$$$7$$$$$$$$$$$$Z7$$Z77777$77$.. .,777$7III777777777777777777IIIIIIIIIII77777Z$$$$7$$$Z7$$$$$$7$Z$$$$$7$Z$7Z$$$$7777II7$M. .:$MI$$I$7I77I7I7777777777777IIIIII?II777777$$$$$$$$7$$$7$$$$777II7$$777Z$$Z$$$77$I$7I7+. .:N I7O7$$IIII7I7777777777IIIIIII???77777777$$$$$$$$$$7$77$$$7I77IIIII77$$$$$77II7$I$7$O ?$ZZ$$I?7IIIIIIIIIIIIIIII????I?I777777II$777$7$7$777I$I7II7???I??II7I778777777I77$$M. +8O$I$IIIIIII7III7IIIIII?????I?I7777777II77I7I$$7777777IIII??III?IIIIII7I77777$7777I.. .MMII7I7III7III7II??7?????IIII777777777II77$7III7II777777I7II7I7III77?I?77II77I$7777N ,8OZMII7I7I?7I7????I????I7I77777777777III777II?II7I77I777II?II77777I77I7777II7777777N ,8ONO$Z$77??I??7+?+???IN7777777777777O$$III77III?I7I?II?77?II7777777777?77I77II7777DM. .OZZOONMZ77ZI77??I?+I++7$$777777777777OZ$7I7?IIII?IIIIIIII?IDII7777777777777777I777777M . $$$$ZZZOOMZ7$I$7II?77I7$ZD$77777777777$D$77IIIIIIIIIIIIIIIIZ$$M$$7777777777I777777777777.. ..77$$$$$ZZZOMIII7$$I$$7$$ZZ$777777777777O$$?IIIIII7I7IIIIII7I7$$$MI7777777777777777777777M. =I777$I777$$Z$ZMO777777777$$M$777777777777M$7IIIIIIIIIIIIIIIIIII77$M$I7777777777777777777I7M. .O77777777777$M.. OM.877777$$7$777777777777$Z$7I7IIIIIIIIIIII7I7II77ZO7777777777777777777777. M7777777777777. .877I77$$77777777777777MN7$7IIIIIIIIIIIII?77MD7$D77777777777777777I7777D. NM7M+MM++MNM,,. :MMN77777777777777IMZ7Z?IIIN$I7I.Z8M$M.$ON+M77777I77II7DMM77MM?II7M. ,M. .DM7MDIIMMM77MM.. .8 .=MM. .7 . MN77IMZ7NMM$.M. N .MM..M:..MOOMM.~. ...... ..
  26. orchestrator -c discover -i replica1.host:3306 orchestrator -c topology -i master.host:3306

    orchestrator -c relocate -i replica3.host:3306 -d replica2.host:3306 orchestrator -c relocate -i replica3.host:3306 -d master.host --debug orchestrator -c which-slaves -i master.host | while read i ; do orchestrator -c disable-gtid -i $i ; done orchestrator -c regroup-slaves -i master.host --debug orchestrator -c relocate-slaves -i replica1.host -d master.host orchestrator -c which-slaves -i master.host | while read i ; do orchestrator -c enable-gtid -i $i ; done CLI samples
  27. In-production experiments, trust • Periodic crash experiments • And periodically

    avoiding these experiments as well • Getting more people involved (on call sys admins) • ALOT of input is gained by people inexperienced with MySQL, leading to more visibility on orchestrator’s side • And of course periodic real crash scenarios 39
  28. Supported replication topologies & technologies • Standard binlog file:pos replication

    • GTID (Oracle & MariaDB) • Pseudo GTID • Binlog servers (MaxScale) • Statement based, row based, semi-sync replication • Single master replication • Master-master (2 node circular) replication • 5.7 parallel replication (in-order required for Pseudo-GTID) 40
  29. Unsupported • 5.6 per-schema parallel replication • Discovery & visualization

    good, operations unsupported • Master-master-master (#nodes > 2) replication • Galera • Unrecognized by orchestrator, identifies each co-master as its own head of topology • Multi-master aka multi source (neither Oracle 5.7 nor MariaDB) • Tungsten 41
  30. Contributions & usage • Known to be deployed by various

    companies • Orchestrator accepts pull requests • Please consider making your own PR • Please submit bug reports • Please assist in documentation 42
  31. • Orchestrator manual https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual • Orchestrator deployment https://github.com/outbrain/orchestrator/wiki/Orchestrator-deployment • Orchestrator

    first steps https://github.com/outbrain/orchestrator/wiki/Orchestrator-first-steps • Orchestrator for developers https://github.com/outbrain/orchestrator/wiki/Orchestrator-for-developers • openark.org http://code.openark.org/blog/tag/orchestrator http://code.openark.org/blog/tag/pseudo-gtid • Binlog servers master promotion http://blog.booking.com/abstracting_binlog_servers_and_mysql_master_promotion_wo_reconfiguring_slaves.html Links of interest
  32. • Binlog Servers at Booking.com https://www.percona.com/live/europe-amsterdam-2015/sessions/binlog-servers-bookingcom • Combining Redis and

    MySQL to store HTTP cookie data https://www.percona.com/live/europe-amsterdam-2015/sessions/combining-redis-and-mysql-store-http-cookie-data • Encrypted MySQL Backups and instant recoverability on large scale https://www.percona.com/live/europe-amsterdam-2015/sessions/encrypted-mysql-backups-and-instant-recoverability-large-scale • Events storage and analysis with Riak at Booking.com https://www.percona.com/live/europe-amsterdam-2015/sessions/events-storage-and-analysis-riak-bookingcom • Riding the Binlog: an in Deep Dissection of the Replication Stream https://www.percona.com/live/europe-amsterdam-2015/sessions/riding-binlog-deep-dissection-replication-stream • Unicode and MySQL https://www.percona.com/live/europe-amsterdam-2015/sessions/unicode-and-mysql • Your Clone Army: Better scalability through more database servers https://www.percona.com/live/europe-amsterdam-2015/sessions/your-clone-army-better-scalability-through-more-database-servers • The CIS MySQL Security Benchmark (LT) https://www.percona.com/live/europe-amsterdam-2015/sessions/cis-mysql-security-benchmark • The Virtues of Boring Technology (Keynote) https://www.percona.com/live/europe-amsterdam-2015/sessions/virtues-boring-technology Other Booking.com talks