Galera Cluster Introduction

Deimos Fr
August 01, 2014

These slides cover Galera Cluster: how it works, and how to install and configure it for production use. They are an introduction to the basic concepts.

Transcript

  1. Galera Cluster Introduction: Summary

    1 Introduction
    2 Installation and Configuration
    3 Initialization
    4 Recover and Troubleshoot

    Pierre Mavro, www.enovance.com
  2. Introduction: Plan

    1 Introduction
      - Galera: Features, benefits and limitations
      - Use cases and Common architecture
  3. Introduction: What is Galera Cluster?

    Galera Cluster provides high system uptime with no data loss, and scalability for future growth. Galera is an open-source product; its vendor, Codership (http://www.codership.com), offers support to help customers increase service availability and lower total cost of ownership.

    Galera Cluster is a synchronous multi-master cluster for MariaDB and MySQL. It needs at least 3 nodes to work reliably, so that a majority survives the loss of one node, and it works exclusively with the InnoDB engine today (others should come in the future).
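The three-node minimum can be illustrated with majority-quorum arithmetic: a group of nodes keeps serving only while it holds more than half of the cluster. The sketch below is a minimal Python illustration, assuming every node carries equal weight and that nodes fail abruptly; the function names are hypothetical and not part of Galera.

```python
def has_quorum(alive_nodes: int, cluster_size: int) -> bool:
    """Return True if the surviving nodes form a strict majority."""
    return 2 * alive_nodes > cluster_size


def tolerated_failures(cluster_size: int) -> int:
    """Largest number of nodes that can fail while a majority survives."""
    return (cluster_size - 1) // 2


# Compare a few cluster sizes (equal node weights assumed).
for size in (2, 3, 5):
    print(size, "nodes tolerate", tolerated_failures(size), "failure(s)")
```

Under these assumptions a two-node cluster tolerates no failure at all, which is why three nodes is the practical minimum.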
  4. Introduction: Features

    Galera is a synchronous multi-master cluster with features such as:
      - Synchronous replication
      - Active-active multi-master topology
      - Read and write to any cluster node
      - Automatic membership control: failed nodes drop from the cluster
      - Automatic node joining
      - True parallel replication, on row level
      - Direct client connections, native MySQL look & feel
  5. Introduction: Benefits

    These features yield benefits previously unseen in a DBMS clustering solution:
      - No slave lag
      - No lost transactions
      - Both read and write scalability
      - Smaller client latencies
  6. Introduction: Limitations

    For database replication to work correctly with Galera, some requirements and limitations apply:
      - InnoDB engine only
      - Primary keys on tables are required. Rows in tables without a primary key may appear in a different order on different nodes.
      - Unsupported queries:
          - LOCK/UNLOCK TABLES
          - Lock functions (GET_LOCK(), RELEASE_LOCK()...)
      - The query log cannot be directed to a table
      - XA transactions are not supported, because a transaction may be rolled back on commit
      - Transaction size cannot be limited
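The first two limitations above can be checked mechanically before migrating a schema. The following Python sketch works on a hand-written description of the tables; the `Table` class, the `galera_issues` function and the sample schema are all hypothetical, and in practice the same information would be read from the server's metadata.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Table:
    name: str
    engine: str
    primary_key: Tuple[str, ...] = ()  # empty tuple means no primary key


def galera_issues(tables: List[Table]) -> List[str]:
    """List the tables that break the engine or primary-key requirement."""
    issues = []
    for t in tables:
        if t.engine.lower() != "innodb":
            issues.append(f"{t.name}: engine {t.engine} is not InnoDB")
        if not t.primary_key:
            issues.append(f"{t.name}: no primary key")
    return issues


# Hypothetical schema used only for illustration.
schema = [
    Table("users", "InnoDB", ("id",)),
    Table("logs", "MyISAM", ("id",)),
    Table("events", "InnoDB"),
]
for issue in galera_issues(schema):
    print(issue)
```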
  8. Introduction: Use cases

    Galera replication works for a wide variety of use cases. Here are some common ones identified in the open-source community:

      - Read Master: a traditional MySQL master-slave topology, except that with Galera all "slave" nodes are capable masters at all times; it is only the application that treats them as slaves. Galera replication can guarantee zero slave lag for such installations and, thanks to parallel slave applying, much better throughput for the cluster.

      - Write Availability: distributing writes across the cluster puts the CPU power of the slave nodes to better use for processing client write transactions. Because replication is row-based, only the changes made during a client transaction are replicated, and applying such a transaction in a slave applier is much faster than processing the original transaction. The cluster can therefore distribute heavy client transaction processing across many master nodes, which yields better overall write throughput.
  9. Introduction: Use cases (continued)

      - WAN Clustering: synchronous replication works fine over a WAN. There is a delay proportional to the network round-trip time (RTT), but it only affects the commit operation.

      - Disaster Recovery: disaster recovery is a sub-class of WAN replication. One data center is passive: it only receives replication events and does not process any client transactions. Such a remote data center is up to date at all times, and no data loss can happen. During recovery, the spare site is simply nominated as primary and the application can continue as normal with a minimal failover delay.

      - Latency Eraser: with a WAN replication topology, cluster nodes can be located close to the clients, so all read and write operations are very fast over the local node connection. The RTT-related delay is experienced only at commit time, and even then end users generally accept it: what usually spoils the end-user experience is slow browsing response time, and read operations are as fast as they can possibly be.
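The claim that only the commit pays the WAN delay can be made concrete with a rough latency model. The Python sketch below is a simplification under stated assumptions: replication costs one round trip to the farthest node at commit time, statements cost a fixed local time, and all numbers are made up for illustration. The function names are hypothetical.

```python
def local_node_latency_ms(statements: int, statement_ms: float,
                          max_rtt_ms: float) -> float:
    """Client talks to a nearby Galera node: only the commit crosses the WAN."""
    return statements * statement_ms + max_rtt_ms


def remote_master_latency_ms(statements: int, statement_ms: float,
                             rtt_ms: float) -> float:
    """Client talks to a single remote master: every statement and the
    commit each cross the WAN once."""
    return statements * statement_ms + (statements + 1) * rtt_ms


# Illustrative numbers: 10 statements of 1 ms each, 80 ms round trip.
print(local_node_latency_ms(10, 1.0, 80.0))
print(remote_master_latency_ms(10, 1.0, 80.0))
```

In this model the gap between the two topologies grows with the number of statements per transaction, while the commit cost stays at one round trip.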
  10. Introduction: Common Architecture

    Here is a classic case for a distributed solution:
      - The load balancers located on each application server can scale to X Galera servers
      - There is no single point of failure (SPOF)
      - Client-server communication latencies are lower
  11. Installation and Configuration: Plan

    2 Installation and Configuration
      - Repository and installation
      - MariaDB Configuration
      - Galera Configuration
  12. Installation and Configuration: Repository and installation

    To install MariaDB and Galera Cluster, the simplest way is to use the official MariaDB repository:

      Install dependencies:
      # aptitude install python-software-properties

    Then add the repository with its key:

      Adding repository:
      # apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xcbcb082a1bb943db
      # add-apt-repository 'deb http://mirrors.linsrv.net/mariadb/repo/10.0/debian wheezy main'
  13. Installation and Configuration: Repository and installation (continued)

    You're now ready to install Galera Cluster:

      Install Galera Cluster:
      # aptitude update
      # aptitude install mariadb-galera-server galera rsync openntpd

    Note: OpenNTPD keeps the clocks synchronized, which is necessary to avoid replication problems. All servers must be set to the same time!
  15. Installation and Configuration: MariaDB Configuration

    Before starting the Galera configuration, take a look at the InnoDB configuration so that Galera Cluster works properly. You may have already tuned it; if so, adapt these values to your configuration:

      /etc/mysql/my.cnf
      [mysqld]
      innodb_buffer_pool_size = 256M
      innodb_log_buffer_size = 8M
      innodb_log_file_size = 256M
      thread_concurrency = 64
      innodb_thread_concurrency = 64
      innodb_read_io_threads = 16
      innodb_write_io_threads = 16
      innodb_flush_log_at_trx_commit = 2
      innodb_file_per_table = 1
      innodb_open_files = 400
      innodb_io_capacity = 600
      innodb_lock_wait_timeout = 60
      innodb_flush_method = O_DIRECT
      innodb_doublewrite = 0
      innodb_additional_mem_pool_size = 20M
      innodb_buffer_pool_restore_at_startup = 500
  16. Installation and Configuration: MariaDB Configuration (continued)

    To apply your configuration, you have to remove the "ib_logfile*" files if you changed "innodb_log_file_size":

      Restart MariaDB:
      # service mysql stop
      # rm /var/lib/mysql/ib_logfile*
      # service mysql start
  18. Installation and Configuration: Galera Configuration

    For the Galera configuration, there is a dedicated file in the MySQL configuration folder. This part is the InnoDB configuration required to make Galera Cluster work properly. Options placed in this file override the ones present in my.cnf, which avoids errors if my.cnf is changed later:

      /etc/mysql/conf.d/mariadb.cnf
      [mysqld]
      binlog_format = ROW
      innodb_autoinc_lock_mode = 2
      innodb_flush_log_at_trx_commit = 2
      innodb_locks_unsafe_for_binlog = 1
  19. Installation and Configuration: Galera Configuration (continued)

    All configuration settings starting with "wsrep_" belong to Galera Cluster.

      /etc/mysql/conf.d/mariadb.cnf
      [mysqld]
      wsrep_provider = /usr/lib/galera/libgalera_smm.so
      wsrep_cluster_name = 'mariadb_cluster'
      wsrep_node_name = node1
      wsrep_node_address = "10.0.0.1"
      wsrep_cluster_address = 'gcomm://10.0.0.1,10.0.0.2,10.0.0.3,10.0.0.4'
      wsrep_retry_autocommit = 0
      wsrep_sst_method = rsync
      wsrep_provider_options = "gcache.size = 1G; gcache.name = /tmp/galera.cache"
      #wsrep_replication_myisam = 1
      #wsrep_sst_receive_address = <x.x.x.x>
      #wsrep_notify_cmd = "script.sh"

    There are not that many options, but they are very important.
  20. Installation and Configuration: Galera Configuration (continued)

      - wsrep_cluster_name: the cluster name (needed if you have multiple Galera Clusters in the same subnet)
      - wsrep_node_name: the node name (as a rule, use the server hostname)
      - wsrep_node_address: the address of this node
      - wsrep_cluster_address: the list of all cluster nodes
      - wsrep_retry_autocommit: the number of times an autocommit statement is retried after it fails because of a replication conflict (0 disables retries)
      - wsrep_sst_method: the state transfer method
          - xtrabackup: a fast method that minimises the time the source node (donor) is blocked. In most cases this is the most appropriate choice.
          - rsync: the fastest method, but it blocks the source node (donor) for longer (this can be a problem in a WAN architecture if bandwidth is the bottleneck)
          - mysqldump: the slowest method (avoid it)
  21. Installation and Configuration: Galera Configuration (continued)

      - wsrep_provider_options: provides other interesting options
          - gcache.size: size of the Galera cache used for transfers between cluster nodes. Increase it under heavy usage.
          - gcache.name: where to store this cache
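Since only the node name and node address differ from one server to the next, the per-node file can be generated rather than edited by hand. The Python sketch below renders a subset of the "wsrep_" settings shown earlier; `render_wsrep_config` is a hypothetical helper, and the provider path, cluster name and addresses are simply the example values from these slides.

```python
from typing import List


def render_wsrep_config(node_name: str, node_address: str,
                        cluster_nodes: List[str],
                        cluster_name: str = "mariadb_cluster") -> str:
    """Render the Galera part of mariadb.cnf for one node."""
    # Every node lists the whole cluster; only name and address are per-node.
    cluster_address = "gcomm://" + ",".join(cluster_nodes)
    lines = [
        "[mysqld]",
        "wsrep_provider = /usr/lib/galera/libgalera_smm.so",
        f"wsrep_cluster_name = '{cluster_name}'",
        f"wsrep_node_name = {node_name}",
        f'wsrep_node_address = "{node_address}"',
        f"wsrep_cluster_address = '{cluster_address}'",
        "wsrep_sst_method = rsync",
    ]
    return "\n".join(lines) + "\n"


nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
print(render_wsrep_config("node2", "10.0.0.2", nodes))
```

Generating the file from one node list keeps wsrep_cluster_address identical on all servers, which is easy to get wrong when copying files manually.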
  22. Initialization: Plan

    3 Initialization
      - Initialize Galera Cluster
      - Galera status
  23. Initialization: Initialize Galera Cluster

    The first thing to do is to stop all your MariaDB instances and start only the first node, like this:

      Initialize Galera Cluster:
      # service mysql start --wsrep_cluster_address='gcomm://'

    An empty gcomm:// address initializes a new cluster.

    DANGER: Never initialize a new cluster on a running one! You may lose data.

    Now start all the other MariaDB services normally; they will join the cluster by themselves:

      Start MariaDB:
      # service mysql start
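The ordering matters: exactly one node is started with the empty gcomm:// address, and every other node is started normally. The Python sketch below only builds that plan as data and does not run anything; `startup_plan` and the node names are hypothetical, while the two command strings are the ones given on this slide.

```python
from typing import List, Tuple

BOOTSTRAP_CMD = "service mysql start --wsrep_cluster_address='gcomm://'"
JOIN_CMD = "service mysql start"


def startup_plan(nodes: List[str]) -> List[Tuple[str, str]]:
    """Return (node, command) pairs: the first node bootstraps, the rest join."""
    if not nodes:
        raise ValueError("at least one node is required")
    first, *others = nodes
    return [(first, BOOTSTRAP_CMD)] + [(node, JOIN_CMD) for node in others]


for node, command in startup_plan(["node1", "node2", "node3"]):
    print(f"{node}: # {command}")
```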
  25. Initialization: Galera status

    Look at this sample output:

      Get Galera status:
      MariaDB [(none)]> show status like 'wsrep_%';
      +----------------------------+-------------------------------------------+
      | Variable_name              | Value                                     |
      +----------------------------+-------------------------------------------+
      | wsrep_local_send_queue_avg | 0.000000                                  |
      | wsrep_local_recv_queue_avg | 0.000000                                  |
      | wsrep_flow_control_paused  | 0.000000                                  |
      | wsrep_local_state_comment  | Synced                                    |
      | wsrep_incoming_addresses   | 10.0.0.1:3306,10.0.0.2:3306,10.0.0.3:3306 |
      | wsrep_cluster_size         | 3                                         |
      | wsrep_cluster_status       | Primary                                   |
      | wsrep_connected            | ON                                        |
      | wsrep_ready                | ON                                        |
      +----------------------------+-------------------------------------------+

    It shows the most important information about your Galera Cluster status.
  26. Initialization: Galera status (continued)

      - wsrep_local_send_queue_avg: average length of the send queue since the last status query. When the cluster experiences network throughput issues or replication throttling, this value is significantly greater than 0.
      - wsrep_local_recv_queue_avg: average length of the receive queue since the last status query. A value greater than 0 means the node cannot apply writesets as fast as it receives them. This can be a sign that the node is overloaded, and it will cause replication throttling.
      - wsrep_flow_control_paused: fraction of the time since the last status query during which replication was paused due to flow control.
      - wsrep_local_state_comment: current node status. Possible values:
          - Joining (requesting/receiving State Transfer): the node is currently joining the cluster
          - Donor/Desynced: the node is the donor for a node joining the cluster
          - Joined: the node has joined the cluster
          - Synced: the node is synced with the cluster
  27. Initialization: Galera status (continued)

      - wsrep_incoming_addresses: comma-separated list of the incoming node addresses in the cluster.
      - wsrep_cluster_size: current number of nodes in the cluster.
      - wsrep_cluster_status: status of the cluster component this node belongs to. Possible values:
          - Primary: the node is part of the primary component (the group of nodes that holds the quorum)
          - Non-primary: the node is not part of the primary component
          - Disconnected: the node is not connected to the cluster
      - wsrep_connected: network connectivity for Galera replication
      - wsrep_ready: whether the node is ready to handle SQL transactions
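The variables described above lend themselves to a simple automated health check. The Python sketch below takes the status as a plain dictionary (how it is fetched from the server is left out) and reports anything that deviates from the healthy sample output; `node_problems` and the expected cluster size are hypothetical choices for illustration.

```python
from typing import Dict, List


def node_problems(status: Dict[str, str], expected_size: int) -> List[str]:
    """Compare wsrep status values with what a healthy node reports."""
    problems = []
    if status.get("wsrep_ready") != "ON":
        problems.append("node is not ready for SQL transactions")
    if status.get("wsrep_connected") != "ON":
        problems.append("node has no replication connectivity")
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not in the primary component")
    if status.get("wsrep_local_state_comment") != "Synced":
        problems.append("node is not synced")
    if int(status.get("wsrep_cluster_size", "0")) != expected_size:
        problems.append("unexpected cluster size")
    return problems


# Values taken from the sample output shown in these slides.
sample = {
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_size": "3",
    "wsrep_cluster_status": "Primary",
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
}
print(node_problems(sample, expected_size=3))
```

Note that a node reporting Donor/Desynced is flagged as "not synced" here even though that state is normal while it serves a joining node; a real check would probably treat it as a warning.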
  28. Initialization: Galera status (continued)

    The other way to get the status of Galera is to run this script (thanks to fridim):

      Get Galera status:
      # galera-status
      NODE STATUS
        cluster status: Primary
        cluster size: 3
        Ready: ON
        connected: ON
        state comment: Synced
      --------------------------------------------------------
      REPLICATION HEALTH (The lower the better)
        fraction replication pause: 0.000000
        flow control sent: 0
        local send queue average: 0.000000
        local receive queue average: 0.004253
      --------------------------------------------------------
      CLUSTER INTEGRITY (should be the same on all nodes)
        local state UUID: 05745e78-989f-11e2-0800-aa6f19ca749c
        cluster conf ID: 1371
  29. Recover and Troubleshoot: Plan

    4 Recover and Troubleshoot
      - High load traffic
      - Full reboot
  30. Recover and Troubleshoot: High load traffic

    If there is heavy traffic on your servers, wsrep_flow_control_paused can grow up to 1. This is generally due to an overload of outgoing traffic (wsrep_local_send_queue_avg) or incoming traffic (wsrep_local_recv_queue_avg).

    If one of your Galera nodes takes a long time to answer the others and slows down your cluster, follow these steps:
      - Look at the currently running queries to find out why the node is slow
      - Look at the Galera logs to see what is happening, and try to correct it manually
      - Move this node out of the load balancer to avoid incoming traffic and let it finish its operations more smoothly

    If all the nodes are too slow because of one, consider restarting MariaDB on that node to return to a normal state.
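The reasoning above (paused fraction first, then the two queue averages) can be written down as a small decision function. This is a Python sketch only: `diagnose_load`, the 0.1 threshold and the returned labels are assumptions made for illustration, not values recommended by Galera.

```python
def diagnose_load(paused: float, send_queue_avg: float,
                  recv_queue_avg: float,
                  paused_threshold: float = 0.1) -> str:
    """Guess which direction of traffic is overloading a node.

    paused is wsrep_flow_control_paused (0 to 1); the two averages are
    wsrep_local_send_queue_avg and wsrep_local_recv_queue_avg.
    """
    if paused <= paused_threshold:
        return "ok"
    # Replication is paused a lot: see which queue is building up.
    if recv_queue_avg > 0 and recv_queue_avg >= send_queue_avg:
        return "incoming"
    if send_queue_avg > 0:
        return "outgoing"
    return "unknown"


print(diagnose_load(0.0, 0.0, 0.004253))  # values close to the sample output
print(diagnose_load(0.8, 0.1, 12.5))
```

A result of "incoming" points at a node that cannot apply writesets fast enough, which is the case where moving it out of the load balancer helps most.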
  32. Recover and Troubleshoot: Full reboot

    If for any reason you need to completely restart your Galera cluster, you have to start it as described above for initialization. That means you initialize one node:

      Initialize Galera Cluster:
      # service mysql start --wsrep_cluster_address='gcomm://'

    and start all the others normally:

      Start MariaDB:
      # service mysql start