Reliable Crash Detection and Failover with Orchestrator

The nature of MySQL replication implies various crash scenarios of varying availability impact.
Orchestrator is an open source project that discovers, manages and recovers your MySQL replication.
This talk discusses how Orchestrator detects failures with minimal false positives/negatives, and figures out the best method of recovery even in complex topologies.

- Complex topology types and crash scenarios
- Common crash detection methods
- Configuration based vs. State based recoveries
- The complexity of promotion paths
- Potential post-recovery limbo states
- Flapping & acknowledgements
- Visibility & control

Shlomi Noach

June 03, 2016

Transcript

  1. How people build software
    Reliable Crash Detection and Failover with Orchestrator
    Shlomi Noach, PerconaLive 2016
  2. Agenda
    • Orchestrator
    • Topologies, crash scenarios
    • Crash detection methods
    • Promotion complexity
    • Limbo states, split brain
    • Flapping & acknowledgement
    • Visibility & control
    • Configuration vs. State based analysis & recovery
    • State of the orchestra
  3. Orchestrator
    • MySQL replication topology manager
    • github.com/outbrain/orchestrator
    • Free & open source
  4. Simple replication
    What could possibly go wrong?
  5. Observe/monitor
    How do you observe your database availability?
  6. Monitor master only
    Common: ping, check :3306, issue SELECT 1
  7. Monitor master only
    And if the response is bad?
    - Is this a false positive?
    - Try again
    - And again?
    - How many times until you’re sure? How much time have you lost?
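The cost of the retry approach on this slide can be made concrete with a little arithmetic. The following is a minimal sketch, not anything Orchestrator ships: `probe`, `master_looks_dead` and `worst_case_delay_s` are hypothetical names, and the probe stands in for whichever check is used (ping, port 3306, `SELECT 1`).

```python
from typing import Callable


def master_looks_dead(probe: Callable[[], bool], attempts: int) -> bool:
    """Declare the master dead only if every one of `attempts` probes fails.

    `probe` is a hypothetical callable returning True when the check succeeds.
    """
    return not any(probe() for _ in range(attempts))


def worst_case_delay_s(attempts: int, timeout_s: float, interval_s: float) -> float:
    """Time lost before declaring failure when every probe runs into its timeout.

    Each attempt waits up to `timeout_s`; attempts are spaced `interval_s` apart.
    """
    return attempts * timeout_s + (attempts - 1) * interval_s
```

More attempts reduce false positives but add directly to the outage: under these assumptions, three attempts with a 5 second timeout and 10 seconds between them cost 35 seconds before any recovery starts.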
  8. Orchestrator’s observation
    Continuously probes your MySQL servers
    - Figuring out who replicates from whom
    - Building the topology tree
    - Understands replication rules
    - At time of crash, knows what the setup should have been
  9. Observe entire topology
    Holistic approach, used by Orchestrator
    MySQL monitoring calls for a MySQL-specific solution
    - Monitor master and replicas (issue queries)
    - Check the replicas’ status
    - Make an analysis based on the results from all servers involved
  10. Multi layered/multi DC replication
    How do you check the availability of an intermediate master (IM)?
  11. Multi layered/multi DC replication
    Holistic approach, used by Orchestrator
    Monitoring the IM and its replicas gives the bigger picture: you may actually not care about the IM’s availability as long as its replicas are happy
  12. Dead intermediate master (Orchestrator’s analysis)
    IM unreachable; its replicas are reachable, and are all in agreement that their master is unreachable.
  13. Dead master (Orchestrator’s analysis)
    Master unreachable; its replicas are reachable, and are all in agreement that their master is unreachable.
  14. Dead master & some replicas (Orchestrator’s analysis)
    Master unreachable; some of its replicas are reachable, and are all in agreement that their master is unreachable. Other replicas are unreachable.
  15. Locked master (Orchestrator’s analysis, pending)
    Master is reachable, but does not execute writes.
    - All replicas are in agreement that the master is reachable
    - No replica is making progress
    Can be handled as a failed master case.
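The analysis rules on the preceding slides (dead master, dead master and some replicas, locked master) can be sketched as a small classifier that combines what the monitor sees with what the replicas report. This is an illustrative sketch only, not Orchestrator's code: `ReplicaView`, `Analysis` and `analyze` are hypothetical names, and the "inconclusive" outcome is an assumption added for cases the slides do not cover.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Analysis(Enum):
    NO_PROBLEM = "no problem"
    DEAD_MASTER = "dead master"
    DEAD_MASTER_AND_SOME_REPLICAS = "dead master and some replicas"
    LOCKED_MASTER = "locked master"
    INCONCLUSIVE = "inconclusive"  # assumption: not named on the slides


@dataclass
class ReplicaView:
    reachable: bool                # could the monitor query this replica?
    sees_master: bool = False      # replica reports a live connection to its master
    making_progress: bool = False  # replica's applied position is advancing


def analyze(master_reachable: bool, replicas: List[ReplicaView]) -> Analysis:
    reachable = [r for r in replicas if r.reachable]
    if not reachable:
        # No second opinion available: never conclude from the monitor alone.
        return Analysis.INCONCLUSIVE
    if not master_reachable:
        if any(r.sees_master for r in reachable):
            # Replicas disagree with the monitor: likely a monitor-side problem.
            return Analysis.INCONCLUSIVE
        if len(reachable) == len(replicas):
            return Analysis.DEAD_MASTER
        return Analysis.DEAD_MASTER_AND_SOME_REPLICAS
    if all(r.sees_master for r in reachable) and not any(
        r.making_progress for r in reachable
    ):
        # Caveat: an idle master looks the same unless something writes regularly.
        return Analysis.LOCKED_MASTER
    return Analysis.NO_PROBLEM
```

The same function applies one level down for a dead intermediate master: pass the IM's reachability and the views of the IM's own replicas.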
  16. Recovery & promotion constraints
    • You’ve made the decision to promote a new master
    • Which one?
    • Are all options valid?
    • Is the current state what you think the current state is?
  17. Promotion constraints
    (Diagram: servers labeled most up to date, less up to date, delayed 24 hours)
    You wish to promote the most up to date replica, otherwise you give up on any replica that is more advanced
  18. Promotion constraints
    (Diagram: servers labeled log_slave_updates, log_slave_updates, no binary logs)
    You must not promote a replica that has no binary logs, or that runs without log_slave_updates
  19. Promotion constraints
    (Diagram: servers labeled DC1, DC1, DC2, DC1)
    You prefer to promote a replica from the same DC as the failed master
  20. Promotion constraints
    (Diagram: servers labeled SBR, SBR, SBR, RBR)
    You must not promote a Row Based Replication server on top of Statement Based Replication
  21. Promotion constraints
    (Diagram: servers labeled 5.6, 5.6, 5.6, 5.7)
    Promoting 5.7 means losing 5.6 (replication is not forward compatible). So perhaps it is worth losing the 5.7 server?
  22. Promotion constraints
    (Diagram: servers labeled 5.6, 5.6, 5.7, 5.7)
    But if most of your servers are 5.7, and 5.7 turns out to be the most up to date, better promote 5.7 and drop the 5.6.
    Orchestrator handles this logic and prioritizes promotion candidates by overall count and state of replicas
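The promotion constraints on the preceding slides can be combined into one candidate-selection sketch. This is a simplified illustration under stated assumptions, not Orchestrator's actual logic: `Replica`, `can_replicate_from` and `pick_candidate` are hypothetical names, positions are reduced to a single comparable integer, and the ordering of preferences (fewest lost replicas, then same DC, then most up to date) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Replica:
    name: str
    position: int            # simplified: how far this replica has applied
    version: Tuple[int, int]  # e.g. (5, 6)
    binlog_format: str        # "STATEMENT" or "ROW"
    log_bin: bool = True
    log_slave_updates: bool = True
    dc: str = "DC1"


def can_replicate_from(replica: Replica, master: Replica) -> bool:
    """Would `replica` survive if `master` were promoted?"""
    if replica.version < master.version:
        return False  # replication is not forward compatible (5.7 -> 5.6)
    if master.binlog_format == "ROW" and replica.binlog_format == "STATEMENT":
        return False  # RBR server on top of SBR replicas
    if replica.position > master.position:
        return False  # more advanced replicas are given up
    return True


def pick_candidate(replicas: List[Replica], failed_master_dc: str) -> Optional[Replica]:
    # Hard constraint: a new master needs binary logs and log_slave_updates.
    eligible = [r for r in replicas if r.log_bin and r.log_slave_updates]
    if not eligible:
        return None

    def score(candidate: Replica) -> Tuple[int, bool, int]:
        kept = sum(
            1 for r in replicas
            if r is not candidate and can_replicate_from(r, candidate)
        )
        # Assumed ordering: keep most replicas, prefer same DC, then freshest.
        return (kept, candidate.dc == failed_master_dc, candidate.position)

    return max(eligible, key=score)
```

With three 5.6 replicas and one 5.7 at the same position, this picks a 5.6 server, since promoting the 5.7 would lose all three. With two of each and the 5.7 servers further ahead, both choices lose two servers and the tie is broken in favor of the more up to date 5.7.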
  23. Promotion constraints, real life!
    (Diagram: servers labeled most up to date, DC2; less up to date, DC1; no binary logs, DC1; DC1)
    Orchestrator can promote one, non-ideal replica, have the rest of the replicas converge, and then refactor again, promoting an ideal server
  24. Ways to avoid promotion constraints mess
    Make sure the first replication tier is consistent; have variety on the 2nd tier
    (Diagram: servers labeled 5.6, 5.7, 5.7, 5.6, 5.6, 5.6)
  25. Ways to avoid promotion constraints mess
    (Diagram: servers labeled 5.6; 5.6, semi-sync; 5.7; 5.7)
    Use semi-sync on designated servers. They will be most up-to-date upon failure
  26. Ways to avoid promotion constraints mess
    (Diagram: servers labeled 5.6, 5.6, 5.7, 5.7)
    Solve the problem by aligning relay logs on the replicas upon master failure.
    • That’s what MHA does
    • Work in progress: Orchestrator to support this! Will require passwordless SSH
  27. Limbos
    Master failed; one replica was lost along with it. Recovery went well.
    What happens when the master is back alive?
  28. Limbos
    What will the promoted master say? What will the lost replica say? What will the lost master say?
    (Diagram callouts: "OHAI!", "Give me traffic!", "VIP is mine!", "Also, good for traffic!")
  29. Solving limbos
    • Orchestrator forcibly breaks replication on lost replica
    • RESET SLAVE ALL or forced detach master on promoted replica
    • read_only=1 on old master, if possible
    • iptables on old master
    (Diagram callouts: "Master_host: //old.master.com", "Can’t find coordinates!", "Read only!")
  30. DC split brain
    (Diagram: DC1 and DC2, with callouts "You’re dead!", "I can’t hear you!", "You’re dead!", "They’re dead!", "They’re dead!")
  31. Flapping & rolling failovers
    • The master is diagnosed as being dead
    • A new master is promoted
    • Turns out some app client is killing it
    • Rolling failover
    • What happens to a dead master that comes back alive?
  32. Flapping & rolling failovers
    • Orchestrator sets a minimal interval between two automated failovers
    • The first one is automated; an immediate one following it gets blocked
    • A human acknowledging the first failover implicitly resets the interval. Good to go for the next automated failover.
    • And a human can always command a failover.
  33. Flapping & rolling failovers
    • Orchestrator marks a failed master as downtimed
    • Even if said server is back in the game (human intervention), this particular server will not be failed over for the duration of the downtime.
    • A human can terminate the downtime
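The anti-flapping rules on the last two slides (a minimal interval between automated failovers, reset by a human acknowledgement, plus downtime on the failed master) can be sketched as a small gate object. This is a sketch under assumptions, not Orchestrator's implementation: `FailoverGate` and its method names are hypothetical, and time is passed in explicitly as seconds so the logic is easy to test.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FailoverGate:
    min_interval_s: float  # minimal interval between automated failovers
    downtime_s: float      # how long a failed master stays downtimed
    _last_failover: Dict[str, float] = field(default_factory=dict)
    _downtimed_until: Dict[str, float] = field(default_factory=dict)

    def may_auto_failover(self, cluster: str, failed_host: str, now: float) -> bool:
        # A downtimed server is never failed over automatically.
        if self._downtimed_until.get(failed_host, float("-inf")) > now:
            return False
        last = self._last_failover.get(cluster)
        # Allowed if there is no unacknowledged recent failover on this cluster.
        return last is None or now - last >= self.min_interval_s

    def record_failover(self, cluster: str, old_master: str, now: float) -> None:
        self._last_failover[cluster] = now
        self._downtimed_until[old_master] = now + self.downtime_s

    def acknowledge(self, cluster: str) -> None:
        # A human ack clears the block: good to go for the next automated failover.
        self._last_failover.pop(cluster, None)

    def end_downtime(self, host: str) -> None:
        # A human can terminate the downtime early.
        self._downtimed_until.pop(host, None)
```

A manual, human-commanded failover would simply bypass `may_auto_failover`; the gate only restricts automated recoveries.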
  34. Recap: how orchestrator performs master failover
    • Detection: everyone agrees the master is dead
    • Is this incident muted?
    • Has this cluster just recently recovered from another failure without ack?
  35. Recap: how orchestrator performs master failover
    • Pick the most up to date replica which will also make for the fewest lost servers (the two are not necessarily the same)
  36. Recap: how orchestrator performs master failover
    • Refactor topology
    • Oh wait, actually, now that everything’s connected, is there a better server to promote?
    • Go for it, refactor again
    • Mark old master as downtimed
    • Detach promoted master from old master
  37. Recap: how orchestrator performs master failover
    • Invoke external hooks
    • Orchestrator neither uses nor implies a specific service discovery technique
    • Your own app/scripts to change VIP/CNAME/Zk entries/Proxy/whatever
  38. Visibility & control
    • Flapping and rolling failovers are avoided by having memory of past/recent events
    • Orchestrator audits:
      - Detection
      - Recoveries
      - Refactoring operations (alas without context)
      - Owners, reasons, internal operations…
    • To audit table; to orchestrator log; to syslog
    • Audit log available via API
  39. Visibility & control
    Control via:
    • Web interface
    • Web API
    • Command line interface
    • Hubot
      .orc sup
      > No incidents which require a failover to report.
      .orc recover failed.server.com
      .orc ack failed-cluster
      .orc relocate this.replica below that.one
      .orc graceful-takeover my-cluster
  40. Configuration vs. State based recoveries
    In configuration based recoveries:
    • You designate specific roles to specific servers, i.e. this server will have to be promoted, or these are the relevant servers, these are not
    • You must then match your operations to those dictated rules.
    • Any change you make (provision, deprovision, relocate, …) must be reflected in configuration
    • Implies chef/puppet deploy; reload of services
  41. Configuration vs. State based recoveries
    In state based recoveries:
    • You trust the tooling to make the best of a situation
    • Basically do whatever a human would do
    • You still want to have roles for your servers
    • chef/puppet may still be involved
    • But those can be added/removed dynamically, and the tooling adapts to change of state
  42. Orchestrator’s detection reliability
    • There is no n-nines number
    • Orchestrator has proven to be very accurate, in production environments
    • Depending on both orchestrator & MySQL configuration, detection may take ~5-10 seconds
  43. Orchestrator is highly available
    (Diagram: Orchestrator HA, with MySQL proxy layer, HTTP proxy layer, backend DB, Orchestrator services and leader)
    • Supports multiple services competing for leadership
    • Requires highly available backend database. Supports master-master setup, and guarantees it to be collision free
  44. Recent developments
    • Binary log indexing: makes for Pseudo-GTID matching within 1s-2s. Reduced recovery time
    • Planned master takeover, forced master takeover
    • Smarter promotion rules
    • Fuzzy names (it’s the simple stuff that makes life happier)
    • SSL (Square contributions)
    • Better master-master support
    • Replication structure analysis
    • MIT license! (thanks @Outbrain)
  45. What’s on the roadmap?
    Ongoing, intended
    • Relay log alignment
    • Semi-sync (currently via contributions)
    Likely
    • Failure detection consensus / leadership handover
    Maybe
    • orchestrator-agent xtrabackup
    Always
    • Reliability, performance, simplification
  46. What’s on the roadmap?
    GitHub commitment to Orchestrator
    • We use it, we will make it better
    • Currently merging changes upstream
    • GitHub will become upstream
    • Better documentation, tutorials, sample public AMI
    • World domination
    Open and grateful for contributions! Please discuss via Issues beforehand
  47. Orchestrator/related talks
    • Choosing a MySQL HA solution today
      Michael Patrick (Percona)
      Tuesday 19, 5:15pm
    • Orchestrator at Square
      John Cesario, Grier Johnson, Brian Ip (Square)
      Thursday 21, 3:00pm
  48. GitHub talks
    • Tutorial: MySQL GTID Implementation, Maintenance, and Best Practices
      Gillian Gunson (GitHub), Brian Cain (Dropbox), Mark Filipi (SurveyMonkey)
      Monday 18, 9:30am
    • Growing MySQL at GitHub
      Tom Krouper, Jonah Berquist
      Wednesday 20, 1:00pm
    • Rookie DBA Mistakes: How I Screwed Up So You Don't Have To
      Gillian Gunson
      Thursday 21, 12:50pm
    • Co-speaking: Dirty Little Secrets
      Jonah Berquist, Shlomi Noach
      Thursday 21, 3:00pm