Slide 1

Introduction to Percona XtraDB Cluster and HAProxy
2014.04.12
Bo-Yi Wu (appleboy)

Slide 2

About me
Github: @appleboy
Twitter: @appleboy
Blog: http://blog.wu-boy.com

Slide 3

Agenda
- About Percona XtraDB Cluster
- Install the first node of the cluster
- Install subsequent nodes to the cluster
- Install HAProxy on the application server
- Testing with a real-world application

Slide 4

Why use Percona XtraDB Cluster?

Slide 5

MySQL Replication vs Percona XtraDB Cluster

Slide 6

Async vs Sync

Slide 7

MySQL Replication: Async
1...10... sec delay

Slide 8

MySQL Replication: Async

Slide 9

Sync
[Diagram] Event → confirm

Slide 10

Percona XtraDB Cluster
Free and Open Source

Slide 11

Percona XtraDB Cluster
Group Communication

Slide 12

Percona XtraDB Cluster
- Synchronous replication
- Multi-master replication
- Parallel applying on slaves
- Data consistency
- Automatic node provisioning

Slide 13

Synchronous replication

Slide 14

Virtually synchronous

Slide 15

Multi-master replication

Slide 16

Multi-master: MySQL
[Diagram] MySQL Replication: Write → Fail

Slide 17

Multi-master: XtraDB Cluster
[Diagram] XtraDB Cluster: Write / Write / Write (every node accepts writes)

Slide 18

Parallel applying on slaves

Slide 19

Parallel apply: MySQL
Write: N threads → Apply: 1 thread

Slide 20

Parallel apply: XtraDB Cluster
Write: N threads → Apply: N threads
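
In PXC, the number of applier threads is set with wsrep_slave_threads; a sketch matching the configs later in this deck (the right value depends on workload and core count):

wsrep_slave_threads=4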

Slide 21

Data consistency

Slide 22

XtraDB Cluster data consistency
[Diagram] node == node == node (all nodes hold identical data)

Slide 23

Automatic node provisioning

Slide 24

[Diagram] Group Communication: Join Cluster → Copy Data

Slide 25

How many nodes should I have?

Slide 26

3 nodes is the minimum recommended configuration
(>= 3 nodes are needed for quorum purposes)

Slide 27

Network failure: split brain
50% is not a quorum

Slide 28

Network failure
XtraDB Cluster: data consistency

Slide 29

garbd
Galera Arbitrator Daemon
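
A sketch of running the arbitrator on a third host so a two-node cluster can still form a quorum (the address and group name are assumptions matching this deck's examples):

$ garbd --address gcomm://192.168.1.100:4567 --group Percona --daemon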

Slide 30

Percona XtraDB Cluster limitations

Slide 31

Only InnoDB tables are supported
MyISAM support is limited

Slide 32

Write performance?
Limited by the weakest node

Slide 33

Joining process

Slide 34

[Diagram] Group Communication: Join Cluster → Copy Data (SST)
An SST of 1 TB can take a long time

Slide 35

State transfer
- Full data: SST
  - New node
  - Node disconnected for a long time
- Incremental: IST
  - Node disconnected for a short time

Slide 36

Snapshot State Transfer (SST)
- mysqldump
  - Small databases
- rsync
  - Donor is disconnected for the whole copy time
  - Faster
- XtraBackup
  - Donor is disconnected for a short time
  - Slower

Slide 37

Incremental State Transfer (IST)
- Node was previously in the cluster
  - Disconnected for maintenance
  - Node crashed
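
IST is only possible while the missing writesets are still in the donor's gcache, which is why the configs later in this deck reserve a large cache. A sketch (size it to write volume times expected downtime):

wsrep_provider_options="gcache.size=4G"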

Slide 38

Install via Percona's yum repository

Slide 39

$ yum -y install \
      Percona-XtraDB-Cluster-server \
      Percona-XtraDB-Cluster-client \
      Percona-Server-shared-compat \
      percona-xtrabackup
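
If the Percona yum repository is not set up yet, one common way at the time of this deck was installing the percona-release package (URL taken from Percona's docs of that era; it may have moved since):

$ rpm -Uhv http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm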

Slide 40

Configuring the nodes

Slide 41

- wsrep_cluster_address=gcomm://
  - An empty address initializes a new cluster (first node only)
- wsrep_cluster_address=gcomm://<node1>,<node2>,...
  - Default port: 4567
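
On PXC 5.5-era init scripts, the first node was typically bootstrapped by starting MySQL with an empty cluster address; a sketch of one common invocation (exact init-script support varies by version):

$ /etc/init.d/mysql start --wsrep-cluster-address="gcomm://"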

Slide 42

Don't use wsrep_urls
wsrep_urls has been deprecated since version 5.5.28

Slide 43

Configuring the first node:

[mysqld]
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_address="gcomm://"
wsrep_sst_auth=username:password
wsrep_provider_options="gcache.size=4G"
wsrep_cluster_name=Percona
wsrep_sst_method=xtrabackup
wsrep_node_name=db_01
wsrep_slave_threads=4
log_slave_updates
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
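
The account in wsrep_sst_auth must exist on the donor with backup privileges before an xtrabackup SST can run; a minimal sketch ('username'/'password' are the placeholders from the config above):

mysql> GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'username'@'localhost' IDENTIFIED BY 'password';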

Slide 44

Configuring subsequent nodes (note the unique node name):

[mysqld]
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_address="gcomm://xxxx,xxxx"
wsrep_sst_auth=username:password
wsrep_provider_options="gcache.size=4G"
wsrep_cluster_name=Percona
wsrep_sst_method=xtrabackup
wsrep_node_name=db_02
wsrep_slave_threads=4
log_slave_updates
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

Slide 45

Monitoring MySQL status:

show global status like 'wsrep%';
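
A narrower pattern is handy for spot checks; the output below is illustrative, not captured from a live cluster:

mysql> show global status like 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+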

Slide 46

Cluster integrity
- wsrep_cluster_size
  - Number of active nodes
- wsrep_cluster_conf_id
  - Configuration version
- wsrep_cluster_status
  - Should be "Primary"

Slide 47

Node status
- wsrep_ready
  - Should be "ON"
- wsrep_local_state_comment
  - Status message
- wsrep_local_send_queue_avg
  - High values indicate a possible network bottleneck
- wsrep_flow_control_paused
  - Replication lag

Slide 48

Realtime wsrep status
https://github.com/jayjanssen/myq_gadgets

Slide 49

Realtime wsrep status

$ ./myq_status -t 1 -h 127.0.0.1 wsrep
Percona / db_03 / Galera 2.8(r165)
Wsrep    Cluster      Node      Queue  Ops     Bytes     Flow      Conflct  PApply Commit
time     P cnf # cmt  sta  Up Dn Up Dn Up  Dn  Up  Dn    p_ms snt  lcf bfa  dst    oooe oool wind
11:47:39 P 73  3 Sync T/T  0  0  5  356 30K 149K 0.0 0   0   0    125    0    0    0
11:47:40 P 73  3 Sync T/T  0  0  0  0   0   0    0.0 0   0   0    125    0    0    0
11:47:41 P 73  3 Sync T/T  0  0  0  0   0   0    0.0 0   0   0    125    0    0    0
11:47:42 P 73  3 Sync T/T  0  0  0  0   0   0    0.0 0   0   0    125    0    0    0
11:47:43 P 73  3 Sync T/T  0  0  0  0   0   0    0.0 0   0   0    125    0    0    0
11:47:44 P 73  3 Sync T/T  0  0  0  0   0   0    0.0 0   0   0    125    0    0    0
11:47:45 P 73  3 Sync T/T  0  0  0  3   0   1.1K 0.0 0   0   0    126    67   0    1
11:47:46 P 73  3 Sync T/T  0  0  0  2   0   994  0.0 0   0   0    126    0    0    0

Slide 50

Application / Cluster

Slide 51

How synchronous writes work

Slide 52

Source node: pessimistic locking
InnoDB transaction locking

Slide 53

Cluster replication
- Before the source returns the commit:
  - The trx is certified on all other nodes
- Nodes reject the trx on locking conflicts
- The commit succeeds only if there are no conflicts on any node

Slide 54

[Diagram] Client 1 and Client 2 both run: update t set col = '12' where id = '1'
Node 1: tx source – Node 2: accepted – Node 3: certify fails

Slide 55

Does the application care?

Slide 56

Writing to all nodes increases deadlock errors

Slide 57

How to avoid deadlock on all nodes?

Slide 58

How to avoid deadlocks
- Write to only one node
  - All pessimistic locking happens on one node
- Let different nodes handle writes for different datasets
  - Different databases, tables, rows, etc.
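
If some cross-node write conflicts are unavoidable, Galera can transparently retry autocommit statements that fail certification; a sketch (the value is an example and sets the number of retries):

wsrep_retry_autocommit=4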

Slide 59

Application-to-cluster connections

Slide 60

Application to cluster
- For writes
  - Best practice: a single node
- For reads
  - All nodes, load balanced
  - glbd – Galera Load Balancer
  - HAProxy

Slide 61

[Diagram] HAProxy load balancer → 192.168.1.100 (Read/Write), 192.168.1.101 (Read), 192.168.1.102 (Read)

Slide 62

HAProxy load balancing

Slide 63

Read and write on the same port

Slide 64

frontend pxc-front
    bind *:3307
    mode tcp
    default_backend pxc-back

backend pxc-back
    mode tcp
    balance leastconn
    option httpchk
    server db1 192.168.1.100:3306 check port 9200 inter 12000 rise 3 fall 3
    server db2 192.168.1.101:3306 check port 9200 inter 12000 rise 3 fall 3
    server db3 192.168.1.102:3306 check port 9200 inter 12000 rise 3 fall 3
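
The "check port 9200" lines assume each DB node answers HTTP health checks there, normally via the clustercheck script behind xinetd (shipped with PXC). A sketch of the usual /etc/xinetd.d/mysqlchk entry; details vary by version:

service mysqlchk
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 9200
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    log_on_failure  += USERID
    only_from       = 0.0.0.0/0
    per_source      = UNLIMITED
}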

Slide 65

Read and write on different ports

Slide 66

frontend pxc-onenode-front
    bind *:3308
    mode tcp
    default_backend pxc-onenode-back

backend pxc-onenode-back
    mode tcp
    balance leastconn
    option httpchk
    server db1 192.168.1.100:3306 check port 9200 inter 12000 rise 3 fall 3
    server db2 192.168.1.101:3306 check port 9200 inter 12000 rise 3 fall 3 backup
    server db3 192.168.1.102:3306 check port 9200 inter 12000 rise 3 fall 3 backup
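
With both frontends defined, the application sends writes to port 3308 (db1 only, with db2/db3 as hot backups) and reads to the load-balanced port 3307 from the previous slide; for example (the HAProxy hostname is a placeholder):

$ mysql -h haproxy.example.com -P 3308   # writes: single node
$ mysql -h haproxy.example.com -P 3307   # reads: balanced across all nodes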

Slide 67

Application server
- CentOS 6 base installation
- EPEL repo added
- HAProxy installed from the EPEL repo
- Sysbench 0.5 package
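
A sketch of driving the cluster through HAProxy with sysbench 0.5 (the oltp.lua path and credentials are assumptions; adjust to your install):

$ sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
    --mysql-host=127.0.0.1 --mysql-port=3307 \
    --mysql-user=username --mysql-password=password \
    --oltp-tables-count=4 prepare
$ sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
    --mysql-host=127.0.0.1 --mysql-port=3307 \
    --mysql-user=username --mysql-password=password \
    --oltp-tables-count=4 --num-threads=8 run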

Slide 68

Live Demo

Slide 69

Thank you