Slide 1

Slide 1 text

Orchestrator High Availability tutorial. Shlomi Noach, GitHub. Percona Live 2018

Slide 2

Slide 2 text

About me: @github/database-infrastructure. Author of orchestrator, gh-ost, freno, ccql and others. Blog at http://openark.org. @ShlomiNoach

Slide 3

Slide 3 text

Agenda • Introduction to orchestrator • Basic configuration • Reliable detection considerations • Successful failover considerations • orchestrator failovers • Failover meta • orchestrator/raft HA • Master discovery approaches

Slide 4

Slide 4 text

GitHub: largest open source hosting. 67M repositories, 24M users. Critical path in build flows. Best octocat T-shirts and stickers.

Slide 5

Slide 5 text

MySQL at GitHub. Stores all the metadata: users, repositories, commits, comments, issues, pull requests, … Serves web, API and auth traffic. MySQL 5.7, semi-sync replication, RBR, cross DC. ~15 TB of MySQL tables. ~150 production servers, ~15 clusters. Availability is critical.

Slide 6

Slide 6 text

orchestrator, meta. Adopted, maintained & supported by GitHub: github.com/github/orchestrator. Previously at Outbrain and Booking.com. orchestrator is free and open source, released under the Apache 2.0 license: github.com/github/orchestrator/releases

Slide 7

Slide 7 text

orchestrator:
 Discovery: probe, read instances, build the topology graph, attributes, queries.
 Refactoring: relocate replicas, manipulate, detach, reorganize.
 Recovery: analyze, detect crash scenarios, structure warnings, failovers, promotions, acknowledgements, flap control, downtime, hooks.
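For example, a refactoring operation such as relocating a replica under a different master can be run via orchestrator-client (hostnames below are placeholders):

 $ orchestrator-client -c relocate -i replica.to.move.example.com -d target.master.example.com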

Slide 8

Slide 8 text

orchestrator/raft: a highly available orchestrator setup. Self healing. Cross DC. Mitigates DC partitioning.

Slide 9

Slide 9 text

orchestrator/raft/sqlite: a self contained orchestrator setup. No MySQL backend. Lightweight deployment. Kubernetes friendly.

Slide 10

Slide 10 text

orchestrator @ GitHub: orchestrator/raft deployed on 3 DCs. Automated failover for masters and intermediate masters. Chatops integration. Recently instated an orchestrator/consul/proxy setup for HA and master discovery.

Slide 11

Slide 11 text

Setting up: configuration for the backend, and for probing/discovering MySQL topologies.

Slide 12

Slide 12 text

"Debug": true,
 "ListenAddress": ":3000",
 
 ! Basic configuration https://github.com/github/orchestrator/blob/master/docs/configuration-backend.md

Slide 13

Slide 13 text

"BackendDB": "sqlite",
 "SQLite3DataFile": "/var/lib/orchestrator/ orchestrator.db", ! Basic configuration, SQLite https://github.com/github/orchestrator/blob/master/docs/configuration-backend.md

Slide 14

Slide 14 text

"MySQLOrchestratorHost": "127.0.0.1",
 "MySQLOrchestratorPort": 3306,
 "MySQLOrchestratorDatabase": "orchestrator",
 
 "MySQLTopologyCredentialsConfigFile": 
 “/etc/mysql/my.orchestrator.cnf“, ! Basic configuration, MySQL https://github.com/github/orchestrator/blob/master/docs/configuration-backend.md

Slide 15

Slide 15 text

"MySQLTopologyUser": "orc_client_user",
 "MySQLTopologyPassword": "123456",
 
 "DiscoverByShowSlaveHosts": true,
 "InstancePollSeconds": 5,
 
 “HostnameResolveMethod": "default",
 "MySQLHostnameResolveMethod": "@@report_host", ! Discovery configuration, local https://github.com/github/orchestrator/blob/master/docs/configuration-discovery-basic.md
 https://github.com/github/orchestrator/blob/master/docs/configuration-discovery-resolve.md

Slide 16

Slide 16 text

"MySQLTopologyCredentialsConfigFile": "/etc/mysql/my.orchestrator-backend.cnf",

 "DiscoverByShowSlaveHosts": false,
 "InstancePollSeconds": 5,

 "HostnameResolveMethod": "default",
 "MySQLHostnameResolveMethod": "@@hostname",

 Discovery configuration, prod:
 https://github.com/github/orchestrator/blob/master/docs/configuration-discovery-basic.md
 https://github.com/github/orchestrator/blob/master/docs/configuration-discovery-resolve.md

Slide 17

Slide 17 text

"ReplicationLagQuery": "select 
 absolute_lag from meta.heartbeat_view",
 
 "DetectClusterAliasQuery": "select 
 ifnull(max(cluster_name), '') as cluster_alias 
 from meta.cluster where anchor=1",
 
 "DetectDataCenterQuery": "select 
 substring_index(
 substring_index(@@hostname, '-',3), 
 '-', -1) as dc", ! Discovery/probe configuration https://github.com/github/orchestrator/blob/master/docs/configuration-discovery-classifying.md

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

Detection & recovery primer. What's so complicated about detection & recovery? How is orchestrator different from other solutions? What makes a reliable detection? What makes a successful recovery? Which parts of the recovery does orchestrator own? What about the parts it doesn't own?

Slide 20

Slide 20 text

Detection runs at all times.

Slide 21

Slide 21 text

Some tools: dead master detection. Common failover tools only observe per-server health. If the master cannot be reached, it is considered to be dead. To avoid false positives, some introduce repetitive checks + intervals, e.g. check every 5 seconds and if seen dead for 4 consecutive times, declare "death". This heuristically reduces false positives, but introduces recovery latency.
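A minimal sketch of that naive per-server check loop (illustration only, not orchestrator's logic; host and credentials file are placeholders):

 failures=0
 while sleep 5; do
   if mysqladmin --defaults-file=/etc/check.cnf -h master.example.com ping >/dev/null 2>&1; then
     failures=0            # master responded; reset the counter
   else
     failures=$((failures + 1))
     if [ "$failures" -ge 4 ]; then
       echo "declaring master dead"
       break
     fi
   fi
 done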

Slide 22

Slide 22 text

Detection: orchestrator continuously probes all MySQL topology servers. At the time of a crash, orchestrator knows what the topology should look like, because it knows what it looked like a moment ago. What insights can orchestrator draw from this fact?

Slide 23

Slide 23 text

Detection: dead master, holistic approach. orchestrator uses a holistic approach: it harnesses the topology itself. orchestrator observes the master and the replicas. If the master is unreachable, but all replicas are happy, then there's no failure. It may be a network glitch.

Slide 24

Slide 24 text

Detection: dead master, holistic approach. If the master is unreachable, and all of the replicas are in agreement (replication broken), then declare "death". There is no need for repetitive checks: replication broke on all replicas for a reason, and following its own timeout.

Slide 25

Slide 25 text

Detection: dead intermediate master. orchestrator uses the exact same holistic logic: if an intermediate master is unreachable and its replicas are broken, then declare "death".

Slide 26

Slide 26 text

Detection: holistic approach. False positives are extremely low. Some cases are left for humans to handle.

Slide 27

Slide 27 text

Faster detection: MySQL config.

 set global slave_net_timeout = 4;

 Implies: master_heartbeat_period = 2
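To make that setting survive a restart, it can also be persisted in my.cnf on the replicas (a sketch, assuming you want the dynamic SET GLOBAL above to be permanent):

 [mysqld]
 slave_net_timeout = 4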

Slide 28

Slide 28 text

Faster detection: MySQL config.

 CHANGE MASTER TO
   MASTER_CONNECT_RETRY = 1,
   MASTER_RETRY_COUNT = 86400;

Slide 29

Slide 29 text

Detection: DC fencing. orchestrator/raft detects and responds to DC fencing (DC network isolation). (Diagram: DC1, DC2, DC3.)

Slide 30

Slide 30 text

Detection: DC fencing. Assume this 3 DC setup: one orchestrator node in each DC; master and a few replicas in DC2. What happens if DC2 gets network partitioned, i.e. no network in or out of DC2? (Diagram: DC1, DC2, DC3.)

Slide 31

Slide 31 text

Detection: DC fencing. From the point of view of DC2 servers, and in particular from the point of view of DC2's orchestrator node: the master and replicas are fine; DC1 and DC3 servers are all dead; there is no need for failover. However, DC2's orchestrator is not part of a quorum, hence not the leader. It doesn't call the shots. (Diagram: DC1, DC2, DC3.)

Slide 32

Slide 32 text

Detection: DC fencing. In the eyes of either DC1's or DC3's orchestrator: all DC2 servers, including the master, are dead, and there is a need for failover. DC1's and DC3's orchestrator nodes form a quorum; one of them becomes the leader. The leader initiates the failover. (Diagram: DC1, DC2, DC3.)

Slide 33

Slide 33 text

Detection: DC fencing. Depicted: a potential failover result. The new master is from DC3. (Diagram: DC1, DC2, DC3.)

Slide 34

Slide 34 text

Recovery & promotion constraints. You've made the decision to promote a new master. Which one? Are all options valid? Is the current state what you think the current state is?

Slide 35

Slide 35 text

Recovery & promotion constraints. "Promote the most up-to-date replica": an anti-pattern.

Slide 36

Slide 36 text

Promotion constraints. You wish to promote the most up-to-date replica; otherwise you give up on any replica that is more advanced. (Diagram: most up to date; less up to date; delayed 24 hours.)

Slide 37

Slide 37 text

Promotion constraints. You must not promote a replica that has no binary logs, or that runs without log_slave_updates. (Diagram: log_slave_updates; log_slave_updates; no binary logs.)

Slide 38

Slide 38 text

Promotion constraints. You prefer to promote a replica from the same DC as the failed master. (Diagram: DC1, DC1, DC2, DC1.)

Slide 39

Slide 39 text

Promotion constraints. You must not promote a Row Based Replication server on top of Statement Based Replication replicas. (Diagram: SBR, SBR, RBR, SBR.)

Slide 40

Slide 40 text

Promotion constraints. Promoting a 5.7 server means losing the 5.6 servers (replication is not forward compatible). So perhaps it is worth losing the 5.7 server instead? (Diagram: 5.6, 5.6, 5.7, 5.6.)

Slide 41

Slide 41 text

Promotion constraints. But if most of your servers are 5.7, and a 5.7 server turns out to be the most up to date, it is better to promote the 5.7 and drop the 5.6. orchestrator handles this logic and prioritizes promotion candidates by the overall count and state of replicas. (Diagram: 5.6, 5.7, 5.7, 5.6.)

Slide 42

Slide 42 text

Promotion constraints: real life. orchestrator can promote one non-ideal replica, have the rest of the replicas converge, and then refactor again, promoting an ideal server. (Diagram: most up-to-date, DC2; less up-to-date, DC1; no binary logs, DC1; DC1.)

Slide 43

Slide 43 text

Other tools: MHA. Avoids the problem by syncing relay logs. The identity of the replica-to-promote is dictated by config. No state-based resolution.

Slide 44

Slide 44 text

Other tools: replication-manager. Potentially uses flashback, unapplying binlog events; this works on MariaDB servers.
 https://www.percona.com/blog/2018/04/12/point-in-time-recovery-pitr-in-mysql-mariadb-percona-server/
 No state-based resolution.

Slide 45

Slide 45 text

Recovery & promotion constraints. More on the complexity of choosing a recovery path: http://code.openark.org/blog/mysql/whats-so-complicated-about-a-master-failover

Slide 46

Slide 46 text

Recovery, meta: flapping, acknowledgements, audit, downtime, promotion rules.

Slide 47

Slide 47 text

"RecoveryPeriodBlockSeconds": 3600, Sets minimal period between two automated recoveries on same cluster. Avoid server exhaustion on grand disasters. A human may acknowledge. ! Recovery, flapping

Slide 48

Slide 48 text

Recovery, acknowledgements.

 $ orchestrator-client -c ack-cluster-recoveries -alias mycluster -reason "testing"

 $ orchestrator-client -c ack-cluster-recoveries -i instance.in.cluster.com -reason "fixed it"

 $ orchestrator-client -c ack-all-recoveries -reason "I know what I'm doing"

Slide 49

Slide 49 text

Recovery, audit.

 /web/audit-failure-detection
 /web/audit-recovery
 /web/audit-recovery/alias/mycluster
 /web/audit-recovery-steps/1520857841754368804:73fdd23f0415dc3f96f57dd4c32d2d1d8ff829572428c7be3e796aec895e2ba1

Slide 50

Slide 50 text

Recovery, audit.

 /api/audit-failure-detection
 /api/audit-recovery
 /api/audit-recovery/alias/mycluster
 /api/audit-recovery-steps/1520857841754368804:73fdd23f0415dc3f96f57dd4c32d2d1d8ff829572428c7be3e796aec895e2ba1
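These endpoints are easy to script against; for example, assuming the ":3000" ListenAddress shown earlier and a placeholder hostname:

 $ curl -s http://orchestrator.example.com:3000/api/audit-recovery/alias/mycluster | jq .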

Slide 51

Slide 51 text

Recovery, downtime.

 $ orchestrator-client -c begin-downtime -i my.instance.com -duration 30m -reason "experimenting"

 orchestrator will not auto-failover downtimed servers.
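Downtime can also be cleared before it expires; a sketch using orchestrator-client's end-downtime command (instance name is a placeholder):

 $ orchestrator-client -c end-downtime -i my.instance.com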

Slide 52

Slide 52 text

Recovery, downtime. On automated failovers, orchestrator will mark dead or lost servers as downtimed. The reason is set to lost-in-recovery.

Slide 53

Slide 53 text

Recovery, promotion rules. orchestrator takes a dynamic approach as opposed to a configuration approach. You may have "preferred" replicas to promote. You may have replicas you don't want to promote. You may indicate those to orchestrator dynamically, and/or change your mind, without touching configuration. Works well with puppet/chef/ansible.

Slide 54

Slide 54 text

Recovery, promotion rules.

 $ orchestrator-client -c register-candidate -i my.instance.com -promotion-rule=prefer

 Options are: prefer, neutral, prefer_not, must_not.

Slide 55

Slide 55 text

Recovery, promotion rules.
 • prefer: if possible, promote this server
 • neutral
 • prefer_not: can be used in two-step promotion
 • must_not: dirty; do not even use this server
 Examples: we set prefer for servers with a better RAID setup; prefer_not for backup servers or servers loaded with other tasks; must_not for gh-ost testing servers.

Slide 56

Slide 56 text

Failovers. orchestrator supports: automated master & intermediate master failovers; manual master & intermediate master failovers per detection; graceful (manual, planned) master takeovers; panic (user initiated) master failovers.

Slide 57

Slide 57 text

"RecoverMasterClusterFilters": [
 “opt-in-cluster“,
 “another-cluster”
 ], "RecoverIntermediateMasterClusterFilters": [
 "*"
 ], ! Failover configuration

Slide 58

Slide 58 text

"ApplyMySQLPromotionAfterMasterFailover": true,
 "MasterFailoverLostInstancesDowntimeMinutes": 10,
 "FailMasterPromotionIfSQLThreadNotUpToDate": true,
 "DetachLostReplicasAfterMasterFailover": true, Special note for ApplyMySQLPromotionAfterMasterFailover: RESET SLAVE ALL
 SET GLOBAL read_only = 0 ! Failover configuration

Slide 59

Slide 59 text

"PreGracefulTakeoverProcesses": [],
 "PreFailoverProcesses": [
 "echo 'Will recover from {failureType} on {failureCluster}’ >> /tmp/recovery.log"
 ], "PostFailoverProcesses": [
 "echo '(for all types) Recovered from {failureType} on {failureCluster}. 
 Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' 
 >> /tmp/recovery.log"
 ],
 "PostUnsuccessfulFailoverProcesses": [],
 "PostMasterFailoverProcesses": [
 "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:
 {failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log"
 ],
 "PostIntermediateMasterFailoverProcesses": [],
 "PostGracefulTakeoverProcesses": [], Failover configuration

Slide 60

Slide 60 text

The $1M question: what do you use for your pre/post failover hooks? To be discussed and demonstrated shortly.

Slide 61

Slide 61 text

"KVClusterMasterPrefix": "mysql/master",
 "ConsulAddress": "127.0.0.1:8500",
 "ZkAddress": "srv-a,srv-b:12181,srv-c", ZooKeeper not implemented yet (v3.0.10) orchestrator updates KV stores at each failover ! KV configuration

Slide 62

Slide 62 text

KV contents.

 $ consul kv get -recurse mysql
 mysql/master/orchestrator-ha:my.instance-13ff.com:3306
 mysql/master/orchestrator-ha/hostname:my.instance-13ff.com
 mysql/master/orchestrator-ha/ipv4:10.20.30.40
 mysql/master/orchestrator-ha/ipv6:
 mysql/master/orchestrator-ha/port:3306

 KV writes are successive, not atomic.
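A consumer can also read a single key directly, based on the keys above:

 $ consul kv get mysql/master/orchestrator-ha/hostname
 my.instance-13ff.com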

Slide 63

Slide 63 text

Manual failovers. Assuming orchestrator agrees there's a problem:

 $ orchestrator-client -c recover -i failed.instance.com

 or via web, or via API: /api/recover/failed.instance.com/3306

Slide 64

Slide 64 text

Graceful (planned) master takeover. Initiates a graceful failover: sets read_only/super_read_only on the master, and promotes a replica once it has caught up.

 $ orchestrator-client -c graceful-master-takeover -alias mycluster

 or via web, or via API. See the PreGracefulTakeoverProcesses, PostGracefulTakeoverProcesses config.

Slide 65

Slide 65 text

Panic (human operated) master failover. Even if orchestrator disagrees there's a problem:

 $ orchestrator-client -c force-master-failover -alias mycluster

 or via API. Forces orchestrator to initiate a failover as if the master were dead.

Slide 66

Slide 66 text

! ! ! ! ! ! ! ! ! " Master discovery How do applications know which MySQL server is the master? How do applications learn about master failover? " !

Slide 67

Slide 67 text

Master discovery. The answer dictates your HA strategy and capabilities.

Slide 68

Slide 68 text

Master discovery methods: hard-coded IPs, DNS/VIP, service discovery, proxy, or combinations of the above.

Slide 69

Slide 69 text

Master discovery via hard-coded IP address: e.g. committing the identity of the master in a config/yml file and distributing it via chef/puppet/ansible. Cons: slow to deploy; using code for state.

Slide 70

Slide 70 text

Master discovery via DNS. Pros: no changes to the app, which only knows about the hostname/CNAME; works cross DC/Zone. Cons: TTL; shipping the change to all DNS servers; connections to the old master are potentially left uninterrupted.

Slide 71

Slide 71 text

" " " ! ! ! ! ! ! ! ! ! ! ! ! ! ! DNS DNS app ! ! ! orchestrator Master discovery via DNS

Slide 72

Slide 72 text

Master discovery via DNS "ApplyMySQLPromotionAfterMasterFailover": true,
 "PostMasterFailoverProcesses": [
 "/do/what/you/gotta/do to apply dns change for {failureClusterAlias}-writer.example.net to {successorHost}"
 ], !

Slide 73

Slide 73 text

Master discovery via VIP. Pros: no changes to the app, which only knows about the VIP. Cons: cooperative assumption; remote SSH / remote exec; sequential execution (only grab the VIP after the old master gave it away); constrained to physical boundaries, DC/Zone bound.

Slide 74

Slide 74 text

" " " ! ! ! ! ! ! ! ! ! ! ! ! ! ! app ⋆ ⋆ ⋆ ! ! ! orchestrator Master discovery via VIP

Slide 75

Slide 75 text

Master discovery via VIP "ApplyMySQLPromotionAfterMasterFailover": true,
 "PostMasterFailoverProcesses": [
 "ssh {failedHost} 'sudo ifconfig the-vip-interface down'",
 "ssh {successorHost} 'sudo ifconfig the-vip-interface up'",
 "/do/what/you/gotta/do to apply dns change for {failureClusterAlias}-writer.example.net to {successorHost}"
 ], !

Slide 76

Slide 76 text

Master discovery via VIP+DNS. Pros: fast within a DC/Zone. Cons: TTL on cross DC/Zone; shipping the change to all DNS servers; connections to the old master are potentially left uninterrupted; slightly more complex logic.

Slide 77

Slide 77 text

" " " ! ! ! ! ! ! ! ! ! ! ! ! ! ! app ⋆ ⋆ ⋆ DNS DNS ! ! ! orchestrator Master discovery via VIP+DNS

Slide 78

Slide 78 text

Master discovery via service discovery, client based: e.g. ZooKeeper is the source of truth, and all clients poll/listen on ZooKeeper. Cons: distributing the change cross DC; it is the clients' responsibility to disconnect from the old master; client overload; how to verify all clients are up-to-date. Pros: (continued on the next slide).

Slide 79

Slide 79 text

Master discovery via service discovery, client based: e.g. ZooKeeper is the source of truth, and all clients poll/listen on ZooKeeper. Pros: no geographical constraints; reliable components.

Slide 80

Slide 80 text

" " " ! ! ! ! ! ! ! ! ! ! ! ! ! ! app $ Service
 discovery $ Service
 discovery ! ! ! Master discovery via service discovery, client based orchestrator/
 raft

Slide 81

Slide 81 text

Master discovery via service discovery, client based.

 "ApplyMySQLPromotionAfterMasterFailover": true,
 "PostMasterFailoverProcesses": [
   "/just/let/me/know about failover on {failureCluster}"
 ],
 "KVClusterMasterPrefix": "mysql/master",
 "ConsulAddress": "127.0.0.1:8500",
 "ZkAddress": "srv-a,srv-b:12181,srv-c",

 ZooKeeper is not implemented yet (as of v3.0.10).

Slide 82

Slide 82 text

Master discovery via service discovery, client based.

 "RaftEnabled": true,
 "RaftDataDir": "/var/lib/orchestrator",
 "RaftBind": "node-full-hostname-2.here.com",
 "DefaultRaftPort": 10008,
 "RaftNodes": [
   "node-full-hostname-1.here.com",
   "node-full-hostname-2.here.com",
   "node-full-hostname-3.here.com"
 ],

 Cross-DC local KV store updates via raft.
 ZooKeeper is not implemented yet (as of v3.0.10).

Slide 83

Slide 83 text

Master discovery via proxy heuristic: the proxy picks the writer based on read_only = 0. Cons: an anti-pattern, do not use this method; a real risk of split brain, with two active masters. Pros: very simple to set up, hence its appeal.

Slide 84

Slide 84 text

Master discovery via proxy heuristic (diagram: app, proxy, orchestrator; the proxy routes writes to the server with read_only=0).

Slide 85

Slide 85 text

Master discovery via proxy heuristic (diagram: the failure mode; two servers report read_only=0, i.e. two active masters).

Slide 86

Slide 86 text

Master discovery via proxy heuristic.

 "ApplyMySQLPromotionAfterMasterFailover": true,
 "PostMasterFailoverProcesses": [
   "/just/let/me/know about failover on {failureCluster}"
 ],

 An anti-pattern. Do not use this method: there is a real risk of split brain, with two active masters.

Slide 87

Slide 87 text

Master discovery via service discovery & proxy: e.g. Consul is authoritative on the current master identity; consul-template runs on the proxy and updates the proxy config based on Consul data. Cons: distributing changes cross DC; proxy HA? Pros: (continued on the next slide).

Slide 88

Slide 88 text

Master discovery via service discovery & proxy. Pros: no geographical constraints; decoupling failover logic from master discovery logic; well known, highly available components; no changes to the app; can hard-kill connections to the old master.

Slide 89

Slide 89 text

Master discovery via service discovery & proxy: used at GitHub. orchestrator fails over and updates Consul. orchestrator/raft is deployed on all DCs; upon failover, each orchestrator/raft node updates its local Consul setup. consul-template runs on GLB (a redundant HAProxy array) and reconfigures + reloads GLB upon a master identity change. The app connects to GLB/HAProxy and gets routed to the master.
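A minimal sketch of what such a consul-template setup might look like (template path, backend name and reload command are illustrative assumptions, not GitHub's actual GLB configuration; the Consul keys are the ones shown earlier):

 # /etc/haproxy/mysql-master.ctmpl
 backend mysql_master
     server master {{ key "mysql/master/orchestrator-ha/hostname" }}:{{ key "mysql/master/orchestrator-ha/port" }} check

 # render the template and reload the proxy whenever the key changes:
 $ consul-template -template "/etc/haproxy/mysql-master.ctmpl:/etc/haproxy/mysql-master.cfg:systemctl reload haproxy"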

Slide 90

Slide 90 text

orchestrator/Consul/GLB(HAProxy) @ GitHub (diagram: app, glb/proxy, Consul x n, orchestrator/raft).

Slide 91

Slide 91 text

orchestrator/Consul/GLB(HAProxy), simplified (diagram: glb/proxy, Consul x n, orchestrator/raft).

Slide 92

Slide 92 text

Master discovery via service discovery & proxy.

 "ApplyMySQLPromotionAfterMasterFailover": true,
 "PostMasterFailoverProcesses": [
   "/just/let/me/know about failover on {failureCluster}"
 ],
 "KVClusterMasterPrefix": "mysql/master",
 "ConsulAddress": "127.0.0.1:8500",
 "ZkAddress": "srv-a,srv-b:12181,srv-c",

 ZooKeeper is not implemented yet (as of v3.0.10).

Slide 93

Slide 93 text

Master discovery via service discovery & proxy.

 "RaftEnabled": true,
 "RaftDataDir": "/var/lib/orchestrator",
 "RaftBind": "node-full-hostname-2.here.com",
 "DefaultRaftPort": 10008,
 "RaftNodes": [
   "node-full-hostname-1.here.com",
   "node-full-hostname-2.here.com",
   "node-full-hostname-3.here.com"
 ],

 Cross-DC local KV store updates via raft.
 ZooKeeper is not implemented yet (as of v3.0.10).

Slide 94

Slide 94 text

Master discovery via service discovery & proxy. Vitess' master discovery works in a similar manner: vtgate servers serve as a proxy, and consult the backend etcd/consul/zk for the identity of the cluster master. Kubernetes works in a similar manner: etcd lists the roster of backend servers.

 See also:
 Automatic Failovers with Kubernetes using Orchestrator, ProxySQL and Zookeeper. Tue 15:50 - 16:40, Jordan Wheeler, Sami Ahlroos (Shopify)
 https://www.percona.com/live/18/sessions/automatic-failovers-with-kubernetes-using-orchestrator-proxysql-and-zookeeper
 Orchestrating ProxySQL with Orchestrator and Consul. Percona Live Dublin, Avraham Apelbaum (wix.com)
 https://www.percona.com/live/e17/sessions/orchestrating-proxysql-with-orchestrator-and-consul

Slide 95

Slide 95 text

orchestrator HA. What makes orchestrator itself highly available?

Slide 96

Slide 96 text

orchestrator HA via raft consensus. orchestrator/raft for out of the box HA. orchestrator nodes communicate via the raft protocol. Leader election is based on quorum. Raft replication log, snapshots. A node can leave, join back, and catch up. https://github.com/github/orchestrator/blob/master/docs/deployment-raft.md

Slide 97

Slide 97 text

orchestrator HA via Raft Concensus "RaftEnabled": true, "RaftDataDir": "/var/lib/orchestrator", "RaftBind": "node-full-hostname-2.here.com", "DefaultRaftPort": 10008, "RaftNodes": [ "node-full-hostname-1.here.com", "node-full-hostname-2.here.com", "node-full-hostname-3.here.com" ], ! ! ! Config docs:
 https://github.com/github/orchestrator/blob/master/docs/configuration-raft.md

Slide 98

Slide 98 text

orchestrator HA via Raft Concensus "RaftAdvertise": “node-external-ip-2.here.com“, “BackendDB": "sqlite", "SQLite3DataFile": "/var/lib/orchestrator/orchestrator.db", ! ! ! Config docs:
 https://github.com/github/orchestrator/blob/master/docs/configuration-raft.md

Slide 99

Slide 99 text

orchestrator HA via shared backend DB. As an alternative to orchestrator/raft, use Galera/XtraDB Cluster/InnoDB Cluster as a shared backend DB, with a 1:1 mapping between orchestrator nodes and DB nodes. Leader election is done via relational statements. https://github.com/github/orchestrator/blob/master/docs/deployment-shared-backend.md

Slide 100

Slide 100 text

orchestrator HA via shared backend DB.

 "MySQLOrchestratorHost": "127.0.0.1",
 "MySQLOrchestratorPort": 3306,
 "MySQLOrchestratorDatabase": "orchestrator",
 "MySQLOrchestratorCredentialsConfigFile": "/etc/mysql/orchestrator-backend.cnf",

 Config docs: https://github.com/github/orchestrator/blob/master/docs/configuration-backend.md

Slide 101

Slide 101 text

orchestrator HA via shared backend DB.

 $ cat /etc/mysql/orchestrator-backend.cnf
 [client]
 user=orchestrator_srv
 password=${ORCHESTRATOR_PASSWORD}

 Config docs: https://github.com/github/orchestrator/blob/master/docs/configuration-backend.md

Slide 102

Slide 102 text

orchestrator HA approaches. Ongoing investment in orchestrator/raft: orchestrator owns its own HA. With a synchronous replication backend, the backend is owned and operated by the user, not by orchestrator. Comparison of the two approaches: https://github.com/github/orchestrator/blob/master/docs/raft-vs-sync-repl.md. Other approaches are a master-master replication backend or a standard replication backend, again owned and operated by the user, not by orchestrator.

Slide 103

Slide 103 text

Supported: Oracle MySQL, Percona Server, MariaDB. GTID (Oracle + MariaDB). Semi-sync; statement/mixed/row; parallel replication. Master-master (2 node circular) replication. SSL/TLS. Consul, Graphite, MySQL/SQLite backend.

Slide 104

Slide 104 text

Not supported: Galera/XtraDB Cluster, InnoDB Cluster, multi-source replication, Tungsten, 3+ node circular replication, 5.6 parallel replication for Pseudo-GTID.

Slide 105

Slide 105 text

Conclusions. orchestrator/raft makes for a good, cross-DC, highly available, self-sustained setup, and is Kubernetes friendly. Consider the SQLite backend. Master discovery methods vary; reduce hooks and friction by using a discovery service.

Slide 106

Slide 106 text

Questions? github.com/shlomi-noach @ShlomiNoach Thank you!