Managing and Visualizing your Replication Topologies with Orchestrator

Shlomi Noach
Percona Live, Sep 2015

Managing and Visualizing … yada yada ... Orchestrator
Agenda:
• What? Why?
• The killer demo
• Breakdown: discovery, refactoring, recovery, interface
• Pseudo GTID, GTID, binlog servers
• Architecture & stack
• Deployment at scale @ Booking.com
• The CLI demo
• Supported/unsupported
• Contributing

Not a sales pitch
• orchestrator is free and open source
• Designed to be as generic as possible
• Some company-specific rules or processes externalized via configuration
https://github.com/outbrain/orchestrator

What? Why?
• With so many replication topologies; with many servers per topology, spanning multiple data centers; with the periodic server failures and movements,
  - Do you know what your topologies look like?
  - Does management know?
• With the complexity of moving slaves around the topology; the rules allowing/disallowing server X to replicate from Y; with the implications of cross-DC traffic on slave latency,
  - Who in your company can refactor your topologies other than yourself?
• In the event of server failure, master or intermediate master breakage,
  - Do you have a clear view of what fails?
  - What kind of solutions do you use?
  - Who can execute a failover / override a failover / understand what’s going on?

The killer demo
or
Let’s break our production servers right now and deal with the consequences once this conference is over

Orchestrator breakdown: Discovery
• Crawls through your topologies
• Automatically recognizes new servers
• Resolves IPs, CNAMEs
• Revisits your servers periodically
• Collects data (version, binlogs, replication, …)
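
The crawl can be pictured as a breadth-first walk over replication links, starting from any one known server. A minimal sketch, assuming a hypothetical in-memory map (`REPLICAS_OF`) in place of actually querying each server; the names are illustrative and not taken from orchestrator's code.

```python
from collections import deque

# Hypothetical stand-in for probing live servers: which replicas each
# instance reports. A real crawler would ask each MySQL server instead.
REPLICAS_OF = {
    "master.host:3306": ["replica1.host:3306", "replica2.host:3306"],
    "replica2.host:3306": ["replica3.host:3306"],
}

# Reverse lookup (also hypothetical): the master each replica points at.
MASTER_OF = {r: m for m, rs in REPLICAS_OF.items() for r in rs}


def discover(seed):
    """Breadth-first crawl from one instance, walking both up and down."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        instance = queue.popleft()
        neighbours = list(REPLICAS_OF.get(instance, []))
        if instance in MASTER_OF:
            neighbours.append(MASTER_OF[instance])
        for neighbour in neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen
```

Seeding with a single replica is enough to reach the whole chain, which is why one discovered instance per topology suffices in this model.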

[Figure: Anonymous Booking.com topology]

Orchestrator breakdown: Refactoring
• Understands:
  - binlog file:pos
  - Pseudo-GTID
  - GTID (Oracle + MariaDB)
  - binlog servers
• Knows the rules for replicating X from Y
• Will refactor your topology for you: safely redesign your topology
• Fine-grained control or “just do it for me, I’m too tired to think”
• Can refactor via slick web UI
• Or via nerdy command line interface
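
To make "rules for replicating X from Y" concrete, here is a minimal sketch of a pre-flight check. The rules shown (binary logging on the master, `log_slave_updates` on an intermediate master, no replicating from a newer version) are general MySQL constraints chosen for illustration; the `Instance` type and `can_replicate_from` function are hypothetical, and this is not orchestrator's actual rule set.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical snapshot of the data collected for one server.
    name: str
    version: tuple           # e.g. (5, 6)
    log_bin: bool            # binary logging enabled
    log_slave_updates: bool  # replicated changes written to own binlog
    is_replica: bool = False


def can_replicate_from(replica, master):
    """Return (ok, reason) for pointing `replica` at `master`."""
    if replica.name == master.name:
        return False, "an instance cannot replicate from itself"
    if not master.log_bin:
        return False, "master has binary logging disabled"
    if master.is_replica and not master.log_slave_updates:
        return False, "intermediate master does not log replicated updates"
    if replica.version < master.version:
        return False, "replica runs an older version than the master"
    return True, ""
```

Returning a reason alongside the verdict lets a UI or CLI explain why a move was refused.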

Orchestrator breakdown: Recovery
• Keeps a state of your topology
• Uses holistic approach to detect failures
  http://code.openark.org/blog/mysql/what-makes-a-mysql-server-failurerecovery-case
• If replication breaks, orchestrator knows what the expected topology looked like
• And can recommend “the next best option”, based on state, not on configuration
• And, if you like, can execute an automated/manual failover that heals your topology and leaves no slave (or only those utterly incapable of restoring) behind
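
One way to read "holistic": do not trust a single failed connection to the master; also ask its slaves whether they can still replicate from it. A minimal sketch under that assumption; the function name and the result labels are made up for illustration.

```python
def diagnose_master(master_reachable, replica_io_running):
    """Classify a master's state from several vantage points.

    master_reachable: whether the monitor itself can connect to the master.
    replica_io_running: one bool per replica, True if that replica still
    receives events from the master.
    """
    if master_reachable:
        return "ok"
    if not replica_io_running:
        # Nobody else to confirm with: cannot tell a dead master from a
        # network problem on the monitor's side.
        return "unknown"
    if not any(replica_io_running):
        # Monitor and every replica agree: the master is gone.
        return "dead-master"
    # Replicas still replicate, so the problem is likely on the monitor's path.
    return "unreachable-master"
```

Only the "dead-master" outcome would justify a failover in this model; the others call for waiting or human attention.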

Orchestrator breakdown: Interface
• Command line
  - Well formed output
  - Go-to if you like your --debug logs
• Web API
  - Simple GET (not REST)
• Web UI
  - Uses Web API
  - Designed to be friendly

Binary logs coordinates: recap
• Replication based on file:pos
• Different file names on masters & slaves
• Different positions on masters & slaves
• Once the connection is broken, difficult to match up again
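
A small sketch of why file:pos is only a local notion. Coordinates order cleanly within one server's logs, but the same pair means nothing on another server. The `BinlogCoordinates` type is hypothetical and assumes the usual `basename.NNNNNN` file naming.

```python
from typing import NamedTuple


class BinlogCoordinates(NamedTuple):
    log_file: str  # e.g. "mysql-bin.015631"
    log_pos: int   # byte offset within that file

    def file_number(self):
        # Compare the numeric suffix, not the string: the suffix can grow
        # wider over time, which would break a plain string comparison.
        return int(self.log_file.rsplit(".", 1)[1])


def is_behind(a, b):
    """True if `a` precedes `b`. Only meaningful when both coordinates
    come from the same server's binary logs."""
    return (a.file_number(), a.log_pos) < (b.file_number(), b.log_pos)
```

Nothing in this type ties a coordinate on a master to the matching point on a slave, which is the gap GTID and Pseudo GTID fill.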

[Figure: MySQL binary & relay logs (Master, Slave)]

[Figure: MySQL binary & relay logs: different languages (Master, Slave)]

[Figure: MySQL binary & relay logs: even more languages (Master, Slave, Slave)]

GTID
• Every transaction has a unique identifier
• When a slave connects to a master, it looks for the last GTID statement it already executed
• Available in Oracle MySQL 5.6, MariaDB 10.0
• Completely different implementations; may cause vendor lock-in
• 5.6 migration path is painful (alleviated in 5.7)
• 5.6 requires binary logs & log-slave-updates enabled on all slaves (alleviated in 5.7)
• 5.6 errant transactions, unexecuted sequences
• GTID will be the requirement in future Oracle features
• MariaDB GTID supports domains; easy to use
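
An errant transaction is one a slave has executed that its master never did. With Oracle-style GTID sets (`uuid:1-5:7`) that is a set difference. A minimal sketch; the helper names are hypothetical, and expanding intervals into individual pairs is only sensible for small illustrative sets.

```python
def parse_gtid_set(text):
    """Expand 'uuid:1-3:5,uuid2:7' into a set of (uuid, sequence) pairs."""
    result = set()
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        # UUIDs contain hyphens but no colons, so ':' splits safely.
        uuid, *intervals = entry.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            for seq in range(int(start), int(end or start) + 1):
                result.add((uuid, seq))
    return result


def errant_transactions(replica_executed, master_executed):
    """Transactions present on the replica but absent from the master."""
    return parse_gtid_set(replica_executed) - parse_gtid_set(master_executed)
```

A non-empty result means promoting or repointing that slave could replay transactions the rest of the topology has never seen.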

Pseudo GTID
• Pseudo GTID offers what GTID offers, without GTID. This includes:
  - Slave repointing
  - Failover schemes
• With fewer requirements
• Bulk operations
• Without upgrading your servers; without installing anything on them; in short: not touching your beloved existing setup
• No vendor lock-in; no migration paths

Pseudo GTID
• Application-side enhancement
• We inject a uniquely identified statement every X seconds. We call it Pseudo GTID.
• Pseudo GTID statements are searchable and identifiable in binary and relay logs
• Make for “markers” in the binary/relay logs
• Injection can be made via MySQL event scheduler or externally
• Otherwise non-intrusive. No changes to topology/versions/methodologies

Injecting Pseudo GTID

    delimiter $$
    create event if not exists
      create_pseudo_gtid_event
      on schedule every 5 second starts current_timestamp
      on completion preserve
      enable
      do
        begin
          -- a fresh unique hint for every injection
          set @pseudo_gtid_hint := uuid();
          -- a harmless statement that still gets written to the binary log
          set @_create_statement := concat('drop ', 'view if exists `meta`.`_pseudo_gtid_hint__', @pseudo_gtid_hint, '`');
          PREPARE st FROM @_create_statement;
          EXECUTE st;
          DEALLOCATE PREPARE st;
        end
    $$
    delimiter ;
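
For the "externally" option, the same statement can be built by any client and executed against the master on a timer. A minimal sketch that only builds the statement text; the function name is hypothetical, and the `meta` schema and hint prefix are taken from the event above.

```python
import uuid


def pseudo_gtid_statement(hint=None):
    """Build one Pseudo GTID marker statement.

    `hint` defaults to a fresh time-based UUID, so every injected
    statement is unique and can be located later in the logs.
    """
    hint = hint or str(uuid.uuid1())
    return "drop view if exists `meta`.`_pseudo_gtid_hint__%s`" % hint
```

Running the returned statement on the master every few seconds, with whatever MySQL client library is at hand, plays the same role as the scheduled event.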

In the binary logs

    mysql> show binlog events in 'mysql-bin.015631' \G
    ...
    Log_name: mysql-bin.015631
    Pos: 1632
    Event_type: Query
    Server_id: 1
    End_log_pos: 1799
    Info: use `meta`; drop view if exists `meta`.`_pseudo_gtid_hint__50731a22-9ca4-11e4-aec4-e25ec4bd144f`
    ...
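
Since the marker is just a statement with a recognizable name, finding it is a text search over the `Info` field of each event. A minimal sketch, assuming the hint format shown above; the regular expression and function name are illustrative.

```python
import re

# Matches the view name injected by the Pseudo GTID statement and
# captures the unique hint that follows the fixed prefix.
HINT_RE = re.compile(r"`_pseudo_gtid_hint__([0-9a-f-]+)`")


def extract_hint(info):
    """Return the Pseudo GTID hint in a binlog event's Info text, or None."""
    match = HINT_RE.search(info)
    return match.group(1) if match else None
```

Events for which this returns `None` are ordinary statements; the others are the markers used to line up two servers' logs.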

[Figure: Recap: MySQL binary & relay logs (Master, Slave)]

[Figure: MySQL binary & relay logs: a virtual contiguous log file (Master, Slave)]

[Figure: MySQL binary & relay logs: Pseudo GTID injection (Master, Slave). Every log shows the same event sequence with Pseudo GTID markers in between:
insert, PGTID 17, update, delete, create, PGTID 82, delete, delete, PGTID 56, insert, insert, update, drop, update
The shorter logs end earlier in that same sequence.]

[Figure: Pseudo GTID: repoint, based on binary logs (Master, Slave). Same event sequence and markers (PGTID 17, PGTID 82, PGTID 56) as in the injection figure.]

[Figure: Pseudo GTID: repoint, based on relay logs (Master, Slave). Same event sequence and markers (PGTID 17, PGTID 82, PGTID 56) as in the injection figure.]
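
The repoint figures boil down to: find the slave's last marker, find that same marker in the new master's log, then step forward by the number of events the slave applied after it. A minimal sketch over plain lists of event labels, using the sequence from the figures; real logs hold byte positions and full statements, and the function name is hypothetical.

```python
MASTER_LOG = [
    "insert", "PGTID 17", "update", "delete", "create", "PGTID 82",
    "delete", "delete", "PGTID 56", "insert", "insert",
    "update", "drop", "update",
]
# The slave has applied everything up to and including the second insert
# that follows PGTID 56.
REPLICA_LOG = MASTER_LOG[:11]


def find_resume_index(replica_log, master_log):
    """Index in master_log of the first event the replica has not applied."""
    marker_positions = [
        i for i, event in enumerate(replica_log) if event.startswith("PGTID")
    ]
    if not marker_positions:
        raise ValueError("no Pseudo GTID marker in replica log")
    last = marker_positions[-1]
    marker = replica_log[last]
    tail = replica_log[last + 1:]  # events applied after the marker

    if marker not in master_log:
        raise ValueError("marker not found in master log")
    at = master_log.index(marker)

    # The events following the marker must match on both sides.
    if master_log[at + 1:at + 1 + len(tail)] != tail:
        raise ValueError("logs diverge after the marker")
    return at + 1 + len(tail)
```

With the sample data the slave resumes at the `update` that follows the two inserts. Because markers are injected every few seconds, the tail that has to be compared stays short.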

More on Pseudo GTID
• Please see https://speakerdeck.com/shlominoach/pseudo-gtid-and-easy-mysql-replication-topology-management to learn about advantages, limitations and implementation.
• Pseudo-GTID deployed on all Booking.com chains

Binlog Servers
• A MySQL-server-like entity
• Which merely relays the master’s binary logs
• Under same name and position
• Nested binlog servers allow for simplified refactoring and offer a simplified & faster master recovery mechanism
• See Binlog Servers @ Booking.com
  https://www.percona.com/live/europe-amsterdam-2015/sessions/binlog-servers-bookingcom
• Orchestrator supports:
  - Hybrid standard + binlog-server replication topologies
  - Pure binlog server topologies

Orchestrator architecture
• Can execute as long-running service
  - Provides HTTP UI, Web API
  - Polls servers, checks for crashes, recovers, periodic operations
  - Leader election
• Can run as command line
  - Issue a single command & exit
• Requires (same, single) MySQL backend for any operation
  - Backend database has the state of topologies
  - orchestrator itself mostly stateless (pending operation excluded, optimistic mode)
• Agent-less for most operations; communicates directly with MySQL instances

Orchestrator architecture
• HA: orchestrator leader election
• Self healing backend data
• All locks auto expiring
• Support authentication (basic-auth, reverse proxy)
• Operations friendly, e.g.:
  - Server maintenance flag
  - Downtiming servers
  - Marking as “best candidate”
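
Leader election and auto-expiring locks fit together naturally: the leader is whoever holds a lock that must be renewed before it expires, so a crashed leader is replaced without manual cleanup. A minimal in-memory sketch of that idea; in a real deployment the lock would live in the shared backend database, and this class is hypothetical, not orchestrator's implementation.

```python
class ExpiringLock:
    """A lock that is released automatically once its TTL passes."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.owner = None
        self.expires_at = 0.0

    def try_acquire(self, candidate, now):
        """Acquire or renew the lock at time `now` (seconds).

        Succeeds if the lock is free, already held by `candidate`,
        or held by someone who failed to renew it in time.
        """
        if self.owner in (None, candidate) or now >= self.expires_at:
            self.owner = candidate
            self.expires_at = now + self.ttl
            return True
        return False
```

Each service node would call `try_acquire` periodically with its own name; whichever call succeeds acts as leader until it stops renewing. Passing `now` explicitly keeps the sketch deterministic and easy to test.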

[Figure: Orchestrator general architecture: orchestrator service, backend db, web API, web xface]

Orchestrator stack & development
• Stack:
  - golang - in retrospect a very good choice: a lot of concurrency; easy deployment; rapid development
  - MySQL as backend database (duh)
  - go-martini web framework
  - Page generation via dirty JavaScript/jQuery (sue me)
  - Twitter Bootstrap
  - Graphs via D3
• Development:
  - GitHub, open source; accepting pull-requests
    https://github.com/outbrain/orchestrator/

MySQL @ Booking.com 2015
• We are a big MySQL shop
• We have ALOT of production servers on ALOT of topologies (aka chains, aka clusters)
• As small as 1 server per topology, as large as hundreds of servers per topology
• Two major data centers, now populating our third
• Single master, plenty of slaves
• All chains are deployed with Pseudo-GTID and controlled by orchestrator
• Larger chains: hybrid, normal + binlog servers topologies (complex!)
• “Pure” binlog-server topologies experimental, non-production
• Some topologies sharded
• A little bit of active/passive master-master

[Figure: Orchestrator architecture @ Booking.com: HTTP load balancer in front of several app nodes, one of them the leader; orchestrator-cli on all MySQL nodes]

Orchestrator @ Booking.com 2015
• 5-6 hosts running orchestrator service, one is elected as leader at any given time
• ALOT of hosts with orchestrator as CLI
• Single elected service polls all our instances
• Each MySQL instance polled every 30s
• Pseudo-GTID deployed on all chains
• Orchestrator configured to auto-recover from the death of any intermediate master
• Orchestrator configured to auto-recover from some master failures
• Both the above happen
• Some checks & dashboards rely on orchestrator data (API / DB)
• Some operations rely on orchestrator logic

CLI demo
Where real stuff gets done
including pie charts
[ASCII art image]

CLI samples

    orchestrator -c discover -i replica1.host:3306
    orchestrator -c topology -i master.host:3306
    orchestrator -c relocate -i replica3.host:3306 -d replica2.host:3306
    orchestrator -c relocate -i replica3.host:3306 -d master.host --debug
    orchestrator -c which-slaves -i master.host | while read i ; do orchestrator -c disable-gtid -i $i ; done
    orchestrator -c regroup-slaves -i master.host --debug
    orchestrator -c relocate-slaves -i replica1.host -d master.host
    orchestrator -c which-slaves -i master.host | while read i ; do orchestrator -c enable-gtid -i $i ; done

In-production experiments, trust
• Periodic crash experiments
• And periodically avoiding these experiments as well
• Getting more people involved (on-call sysadmins)
• ALOT of input is gained from people inexperienced with MySQL, leading to more visibility on orchestrator’s side
• And of course periodic real crash scenarios

Supported replication topologies & technologies
• Standard binlog file:pos replication
• GTID (Oracle & MariaDB)
• Pseudo GTID
• Binlog servers (MaxScale)
• Statement based, row based, semi-sync replication
• Single master replication
• Master-master (2 node circular) replication
• 5.7 parallel replication (in-order required for Pseudo-GTID)

Unsupported
• 5.6 per-schema parallel replication
  - Discovery & visualization good, operations unsupported
• Master-master-master (#nodes > 2) replication
• Galera
  - Unrecognized by orchestrator, identifies each co-master as its own head of topology
• Multi-master aka multi source (neither Oracle 5.7 nor MariaDB)
• Tungsten

Contributions & usage
• Known to be deployed by various companies
• Orchestrator accepts pull requests
• Please consider making your own PR
• Please submit bug reports
• Please assist in documentation

Links of interest
• Orchestrator manual
  https://github.com/outbrain/orchestrator/wiki/Orchestrator-Manual
• Orchestrator deployment
  https://github.com/outbrain/orchestrator/wiki/Orchestrator-deployment
• Orchestrator first steps
  https://github.com/outbrain/orchestrator/wiki/Orchestrator-first-steps
• Orchestrator for developers
  https://github.com/outbrain/orchestrator/wiki/Orchestrator-for-developers
• openark.org
  http://code.openark.org/blog/tag/orchestrator
  http://code.openark.org/blog/tag/pseudo-gtid
• Binlog servers master promotion
  http://blog.booking.com/abstracting_binlog_servers_and_mysql_master_promotion_wo_reconfiguring_slaves.html

Image, sources & other credits
• http://hyperboleandahalf.blogspot.nl/2010/04/alot-is-better-than-you-at-everything.html
• http://www.hbo.com/game-of-thrones
• https://imgflip.com/memegenerator/Brace-Yourselves-X-is-Coming
• http://www.glassgiant.com/ascii/
• https://www.keepcalm-o-matic.co.uk/
• @isamlambert for making a couple sparkles to ignite this
• Team @ Booking.com for ideas, input, time testing, time using
• Contributors!

Other Booking.com talks
• Binlog Servers at Booking.com
  https://www.percona.com/live/europe-amsterdam-2015/sessions/binlog-servers-bookingcom
• Combining Redis and MySQL to store HTTP cookie data
  https://www.percona.com/live/europe-amsterdam-2015/sessions/combining-redis-and-mysql-store-http-cookie-data
• Encrypted MySQL Backups and instant recoverability on large scale
  https://www.percona.com/live/europe-amsterdam-2015/sessions/encrypted-mysql-backups-and-instant-recoverability-large-scale
• Events storage and analysis with Riak at Booking.com
  https://www.percona.com/live/europe-amsterdam-2015/sessions/events-storage-and-analysis-riak-bookingcom
• Riding the Binlog: an in Deep Dissection of the Replication Stream
  https://www.percona.com/live/europe-amsterdam-2015/sessions/riding-binlog-deep-dissection-replication-stream
• Unicode and MySQL
  https://www.percona.com/live/europe-amsterdam-2015/sessions/unicode-and-mysql
• Your Clone Army: Better scalability through more database servers
  https://www.percona.com/live/europe-amsterdam-2015/sessions/your-clone-army-better-scalability-through-more-database-servers
• The CIS MySQL Security Benchmark (LT)
  https://www.percona.com/live/europe-amsterdam-2015/sessions/cis-mysql-security-benchmark
• The Virtues of Boring Technology (Keynote)
  https://www.percona.com/live/europe-amsterdam-2015/sessions/virtues-boring-technology

Questions?
@ShlomiNoach
http://openark.org
http://blog.booking.com
Thank you!
