Slide 1

Slide 1 text

Replication Smackdown
EffectiveMySQL.com - Performance, Scalability, Site Reliability
@RonaldBradford #PerconaLive
Ronald Bradford
http://ronaldbradford.com
2016.10
Tuesday, October 4, 16

Slide 2

Slide 2 text

Agenda
Replication
  What, Why, How, MySQL Types
Production Systems
  Requirements, Needs, MTBF
A New Mindset
  Availability, Classification, Pipeline

Slide 3

Slide 3 text

Which is the best replication approach to use?
Answer: It depends

Slide 4

Slide 4 text

Which is the best replication approach to use?
Answer: It depends

Slide 5

Slide 5 text

What is replication?

Slide 6

Slide 6 text

What?

Slide 7

Slide 7 text

What?

Slide 8

Slide 8 text

What?
Copy of data
  Full or partial

Slide 9

Slide 9 text

What?
Copy of data
  Full or partial
Translation of data
  e.g. Oracle to MySQL

Slide 10

Slide 10 text

What?
Copy of data
  Full or partial
Translation of data
  e.g. Oracle to MySQL
Transformation of data
  e.g. MySQL to DW
  e.g. MySQL to Hadoop

Slide 11

Slide 11 text

ONE WAY FORM

Slide 12

Slide 12 text

TWO WAY FORM

Slide 13

Slide 13 text

Why use replication?

Slide 14

Slide 14 text

Why?
Redundancy (copies)
Availability (distribution)
Failover
Scalability (read/write)
Performance (optimizations)
Backups (locking, load)
Consolidation

Slide 15

Slide 15 text

What are production requirements?

Slide 16

Slide 16 text

PROD REQS
Acceptable latency
Acceptable throughput
Performance under load
High availability
Failover
Disaster recovery
Security
Backups
Load testing
Monitoring
Alerting
Sizing
How does replication help achieve production requirements?

Slide 17

Slide 17 text

What does MySQL offer for replication?

Slide 18

Slide 18 text

Types
[Traditional] MySQL Replication
  Asynchronous / Semi-synchronous
MySQL Cluster
Galera (MySQL/Percona/MariaDB)
MySQL Group Replication (RC) / InnoDB Cluster (TBD)
Amazon RDS Multi-AZ (MAZ) & Aurora
Others (e.g. Google Cloud SQL, Clustrix, DRBD)

Slide 19

Slide 19 text

What are barriers to usage?

Slide 20

Slide 20 text

Traditional
async - since 3.23
  lag
  drift
  consistency
  throughput
semi-sync - since 5.5
  lag
  drift
  consistency
  throughput

Slide 21

Slide 21 text

MySQL Cluster
Additional installation complexity
  Data/SQL/Admin nodes
Different admin interface
Different backup strategy
LAN based
Same SQL syntax
Limited large join options

Slide 22

Slide 22 text

Galera
Syntax limitations
Feature limitation (e.g. MEMORY table), primary key
OS Limitations (Linux Only)
Hot data spots / Large transactions in multi-master write
Timeouts
Schema upgrades
LAN v WAN

Slide 23

Slide 23 text

Group replication
Syntax Limitations, Feature Limitation
Hot Spots / Large transactions in multi-master write
Supported on Linux, Windows, Solaris, FreeBSD, OSX
Requirements
  MySQL 5.7, GTID, binlog_format=ROW
  Other configuration settings
RC only

Slide 24

Slide 24 text

InnoDB Cluster
Only in labs
Based on group replication limitations
Helps solve the routing problem
Simplified orchestration in JS

Slide 25

Slide 25 text

What are improvements to replication?

Slide 26

Slide 26 text

MySQL Versions
5.6
  crash-safe slaves, GTID, group commit, multi-threaded
5.7
  semi-sync improvements, multi-threaded V2, multi-source, XA support, group replication

Slide 27

Slide 27 text

GTID
MySQL
Percona
MariaDB
Custom
Others
http://code.openark.org/blog/mysql/refactoring-replication-topology-with-pseudo-gtid
https://mariadb.com/kb/en/mariadb/gtid/
https://www.facebook.com/notes/mysql-at-facebook/lessons-from-deploying-mysql-gtid-at-scale/10152252699590933/
Does not improve async/semi sync replication?
Improves [faster] failover
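One way to picture why GTIDs can speed up failover: when every transaction carries a global identifier, candidate replicas can be compared by their executed GTID sets rather than by binary log file names and positions. The sketch below is only an in-memory illustration of that comparison. The helper names (`parse_gtid_set`, `pick_promotion_candidate`), the replica names, and the shortened source identifier `aaaa` are all hypothetical, and nothing here talks to a real server.

```python
from typing import Dict, Optional, Set

GtidSet = Dict[str, Set[int]]


def parse_gtid_set(text: str) -> GtidSet:
    """Parse text such as 'uuid:1-5:7,uuid2:1-3' into {uuid: transaction numbers}."""
    result: GtidSet = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single number such as '7' is an interval of length one.
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def is_subset(a: GtidSet, b: GtidSet) -> bool:
    """True if every transaction in a is also present in b."""
    return all(numbers <= b.get(uuid, set()) for uuid, numbers in a.items())


def pick_promotion_candidate(executed: Dict[str, str]) -> Optional[str]:
    """Return the replica whose executed set contains every other replica's set.

    Returns None when the replicas have diverged and no single replica
    holds all transactions (an assumption of this sketch: that case is
    left for a human to resolve).
    """
    parsed = {name: parse_gtid_set(text) for name, text in executed.items()}
    for name, gtids in parsed.items():
        if all(is_subset(other, gtids) for other in parsed.values()):
            return name
    return None


# Hypothetical replicas and executed GTID sets.
replicas = {"r1": "aaaa:1-100", "r2": "aaaa:1-120", "r3": "aaaa:1-110"}
candidate = pick_promotion_candidate(replicas)
```

Here `candidate` is `r2`, the only replica that holds everything the others have. The sets are expanded number by number for readability, which would not suit very large transaction ranges.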

Slide 28

Slide 28 text

Multi-threaded
parallel schema applier (5.6)
parallel query applier (5.7)
Does improve async/semi sync replication?
  Improves performance (i.e. lag)
  Does not eliminate lag
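The idea behind a per-schema parallel applier can be sketched as follows: events for different schemas may be applied by different workers, while events for the same schema stay on one worker in their original order. This is an illustration of the concept only, not how the MySQL applier is implemented; the function name `assign_to_workers`, the hashing scheme, and the sample events are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple
from zlib import crc32

Event = Tuple[str, str]  # (schema name, statement label)


def assign_to_workers(events: List[Event], worker_count: int) -> Dict[int, List[Event]]:
    """Route each event to a worker chosen from its schema name.

    All events for one schema land on the same worker, so their relative
    order is preserved; different schemas may be applied in parallel.
    """
    queues: Dict[int, List[Event]] = defaultdict(list)
    for schema, statement in events:
        worker = crc32(schema.encode("utf-8")) % worker_count
        queues[worker].append((schema, statement))
    return dict(queues)


# Hypothetical workloads: one spread over two schemas, one on a single schema.
mixed = [("shop", "insert 1"), ("blog", "insert 2"), ("shop", "update 3"), ("blog", "update 4")]
single = [("shop", "insert 1"), ("shop", "update 2"), ("shop", "delete 3")]

mixed_queues = assign_to_workers(mixed, worker_count=4)
single_queues = assign_to_workers(single, worker_count=4)
```

The single-schema workload always ends up on one worker however many workers exist, which is one way to see why this form of parallelism can reduce lag but cannot eliminate it.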

Slide 29

Slide 29 text

How do we apply replication to our systems?

Slide 30

Slide 30 text


Slide 31

Slide 31 text

What are your business needs?

Slide 32

Slide 32 text

What are your business needs?
What are [ideal] business needs?

Slide 33

Slide 33 text

What are your business needs?
What are [ideal] business needs?
What are [acceptable] business needs?

Slide 34

Slide 34 text

Objectives
Mean Time Between Failure (MTBF)
Mean Time To Detect (MTTD)
Mean Time To Recover (MTTR)
Recovery Point Objective (RPO)
Recovery Time Objective (RTO)
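To make the objectives above concrete, here is a small arithmetic sketch. It assumes the common steady-state formula availability = uptime / (uptime + downtime), and it assumes that each failure costs detection time plus recovery time (that is, MTTR is counted from the moment of detection; some definitions fold detection into MTTR). The function names and the sample figures are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability(mtbf_hours: float, mttd_hours: float, mttr_hours: float) -> float:
    """Fraction of time the service is up.

    Assumption: downtime per failure is time to detect plus time to recover.
    """
    downtime = mttd_hours + mttr_hours
    return mtbf_hours / (mtbf_hours + downtime)


def annual_downtime_hours(avail: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


def meets_objectives(loss_seconds: float, recovery_seconds: float,
                     rpo_seconds: float, rto_seconds: float) -> bool:
    """True if an incident stayed within both the RPO and the RTO."""
    return loss_seconds <= rpo_seconds and recovery_seconds <= rto_seconds


# Hypothetical figures: a failure roughly every 999 hours,
# 30 minutes to notice it and 30 minutes to recover.
a = availability(mtbf_hours=999, mttd_hours=0.5, mttr_hours=0.5)
yearly = annual_downtime_hours(a)
```

With these figures the availability is about 0.999 ("three nines"), or a little under nine hours of downtime a year. Halving detection time improves the number just as much as halving recovery time, which is why MTTD sits alongside MTTR in the list.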

Slide 35

Slide 35 text

The new mindset in architecture

Slide 36

Slide 36 text

1. Availability
2. Classification
3. Pipeline

Slide 37

Slide 37 text

Data Availability

Slide 38

Slide 38 text

Data availability
Ability to write data
Ability to read data
Ability to [read|write] cached data
Ability to operate with no data
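The four levels above can be read as a degradation ladder: the service keeps answering, but with progressively less data access. A minimal sketch, assuming a hypothetical topology of one writable primary, one read replica, and one cache; the names `DataMode` and `data_mode` are illustrative.

```python
from enum import Enum


class DataMode(Enum):
    READ_WRITE = "write and read data"
    READ_ONLY = "read data"
    CACHED = "cached data only"
    NO_DATA = "operate with no data"


def data_mode(primary_up: bool, replica_up: bool, cache_up: bool) -> DataMode:
    """Pick the best level of data access that is still available."""
    if primary_up:
        return DataMode.READ_WRITE
    if replica_up:
        return DataMode.READ_ONLY
    if cache_up:
        return DataMode.CACHED
    return DataMode.NO_DATA


# Hypothetical incident: the primary is down, the replica still answers reads.
mode = data_mode(primary_up=False, replica_up=True, cache_up=True)
```

In this example the service drops to read-only rather than going offline. Deciding what each endpoint does in each mode is an application design choice, not something replication provides on its own.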

Slide 39

Slide 39 text


Slide 40

Slide 40 text

What do I mean?

Slide 41

Slide 41 text

What do I mean?
What is your definition of downtime?

Slide 42

Slide 42 text

Availability
Not database availability
  e.g. those maintenance windows
Not data availability
  e.g. Write/Read/Cache/None
It is all about service availability
  i.e. endpoints

Slide 43

Slide 43 text

Data Class

Slide 44

Slide 44 text

Data Class

Slide 45

Slide 45 text

Data Class
Not all data in an RDBMS schema (or even table) is equal
User data (register, modify)
Change password
Last login date
Add content
Comment/Rate/Score

Slide 46

Slide 46 text

Data Class
Not all data in an RDBMS schema (or even table) is equal
User data (register, modify)
Change password
Last login date
Add content
Comment/Rate/Score
Current Order
Last Order
Historical Orders
Credit Card Details

Slide 47

Slide 47 text

Data Class
Some data needs to be more highly available than other data
Some data access requires more responsiveness than others
Some data has acceptable data loss
Some data can be unavailable some of the time
Some data visibility can vary between users
All data should be secure, some more secure

Slide 48

Slide 48 text

Data Class
Some data needs to be more highly available than other data
Some data access requires more responsiveness than others
Some data has acceptable data loss
Some data can be unavailable some of the time
Some data visibility can vary between users
All data should be secure, some more secure
Reclassification of data changes replication requirements
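A toy illustration of the last point, that reclassifying data changes what is asked of replication. Everything here is an assumption made for the sake of the example: the `DataClass` record, the single "acceptable loss" attribute, the five-second threshold, and the mapping to an approach are arbitrary placeholders, not recommendations.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataClass:
    name: str
    acceptable_loss_seconds: int  # how much recent data may be lost


def suggested_replication(dc: DataClass, semi_sync_threshold_seconds: int = 5) -> str:
    """Map a data class to a replication style (placeholder rules only)."""
    if dc.acceptable_loss_seconds == 0:
        return "synchronous"
    if dc.acceptable_loss_seconds <= semi_sync_threshold_seconds:
        return "semi-synchronous"
    return "asynchronous"


# Hypothetical classification of items named on the slides.
credit_cards = DataClass("Credit Card Details", acceptable_loss_seconds=0)
current_order = DataClass("Current Order", acceptable_loss_seconds=2)
last_login = DataClass("Last login date", acceptable_loss_seconds=0)

# Reclassify: decide that losing an hour of last-login updates is acceptable.
last_login_relaxed = replace(last_login, acceptable_loss_seconds=3600)

before = suggested_replication(last_login)
after = suggested_replication(last_login_relaxed)
```

Treating the last login date like credit card details demands the strictest option; once the business accepts some loss for it, a much cheaper approach becomes acceptable for that class of data.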

Slide 49

Slide 49 text

PROD systems
Requirements
  Acceptable latency
  Acceptable throughput
  Performance under load
  High availability
  Failover
  Disaster recovery
  Security
MTBF, MTTD, MTTR, RPO, RTO
Responsibilities

Slide 50

Slide 50 text

Example (Financial)

Slide 51

Slide 51 text

Data Class
Single greatest feature loss
  Referential Integrity
  A & C of ACID

Slide 52

Slide 52 text

Data Pipeline

Slide 53

Slide 53 text

Data pipeline
A single request does not produce one synchronous response
Data is not stored in one RDBMS or product type
Data locality for responsiveness
Use product strengths for data manipulation

Slide 54

Slide 54 text

Rethinking how to choose MySQL Replication

Slide 55

Slide 55 text

UTF-FTV
you-twit-face-flix-talk-vr
The next great social experience

Slide 56

Slide 56 text

Old way
Found in many “traditional” frameworks/OSS products
Users table in monolithic schema
Synchronous web requests for information
Polling for new/streaming information
Data for application is available or not available
Replication enables read scalability only

Slide 57

Slide 57 text

New Way stores
Highly available synchronous store
Columnar store
Messaging (PUB/SUB)
Graph
Queue
Search

Slide 58

Slide 58 text

User path
Microservices
Login/Logout
Register/Maintain
Log actions (login good/bad, click, mouse movement)
Friends
Friends interactions
Availability: What type of data access is available

Slide 59

Slide 59 text

New Way
Graph Store
  Friends
  Customized for relevancy/strength algorithm
Queue
  Password changes, User profile changes
  Lost password

Slide 60

Slide 60 text

Notifications
User actions (success, failure) are published (PUB)
(SUB) Subscriber logs information
(SUB) Subscriber audits for unexpected behavior
(SUB) Subscriber notifies friends user is online/offline
Pipeline of multiple asynchronous actions
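The publish/subscribe shape described above can be sketched with a tiny in-process broker. This is an illustration of the pattern only: the `Broker` class, the topic name `user.action`, and the event fields are hypothetical, and delivery here is a plain synchronous function call, whereas a real pipeline would hand events to an asynchronous messaging product.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Broker:
    """Minimal in-memory publish/subscribe hub."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Every subscriber of the topic receives every event.
        for handler in self._subscribers[topic]:
            handler(event)


log: List[str] = []
alerts: List[str] = []
friend_notices: List[str] = []


def log_action(event: dict) -> None:
    log.append(f"{event['user']} {event['action']} {event['result']}")


def audit(event: dict) -> None:
    # Stand-in for "unexpected behavior": any failed action raises an alert.
    if event["result"] == "failure":
        alerts.append(f"check {event['user']}")


def notify_friends(event: dict) -> None:
    if event["action"] == "login" and event["result"] == "success":
        friend_notices.append(f"{event['user']} is online")


broker = Broker()
for handler in (log_action, audit, notify_friends):
    broker.subscribe("user.action", handler)

broker.publish("user.action", {"user": "alice", "action": "login", "result": "success"})
broker.publish("user.action", {"user": "bob", "action": "login", "result": "failure"})
```

The publisher knows nothing about logging, auditing, or friend notifications; each subscriber can be added, removed, or made unavailable without changing the login path.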

Slide 61

Slide 61 text

Rate Me
Rate a comment
Client only feedback
Action held on client
  batched, overloaded, timed transmission
Supports rate/unrate (client side only)
Class: Optimized for payload
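A sketch of the "action held on client" idea: ratings are buffered locally, a rate followed by an unrate cancels out before anything is sent, and only the net result goes over the wire in one batch. The `RatingBuffer` class, the size-based flush rule, and the comment identifiers are assumptions; timed or piggybacked transmission is not shown.

```python
from typing import Dict, List, Tuple


class RatingBuffer:
    """Holds rate/unrate actions on the client until a batch is sent."""

    def __init__(self, max_batch: int = 10) -> None:
        self.max_batch = max_batch
        self._pending: Dict[str, int] = {}

    def rate(self, comment_id: str, score: int) -> None:
        # A later rating for the same comment overwrites the earlier one.
        self._pending[comment_id] = score

    def unrate(self, comment_id: str) -> None:
        # Withdrawn before the batch is sent: nothing is transmitted for it.
        self._pending.pop(comment_id, None)

    def should_flush(self) -> bool:
        return len(self._pending) >= self.max_batch

    def flush(self) -> List[Tuple[str, int]]:
        """Return the net ratings to transmit and clear the buffer."""
        batch = sorted(self._pending.items())
        self._pending.clear()
        return batch


buffer = RatingBuffer(max_batch=2)
buffer.rate("c1", 5)
buffer.rate("c2", 3)
buffer.unrate("c1")  # cancelled locally
ready_before = buffer.should_flush()
buffer.rate("c3", 4)
ready_after = buffer.should_flush()
payload = buffer.flush()
```

Four user actions become a payload of two ratings, which is the sense in which this class of data is optimized for payload rather than for immediate consistency.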

Slide 62

Slide 62 text

Conclusion

Slide 63

Slide 63 text

Conclusion
Do you choose a replication approach to match your [ideal] business needs and data store(s)
OR
Do you architect a data infrastructure to meet your [ideal] business needs and target specific replication (aka availability) approaches where applicable

Slide 64

Slide 64 text

What does this have to do with replication?
Answer: Everything