Redis multi-data center two-way synchronization

halfrost
October 31, 2019

Transcript

  1. Contents

    1 Introduction • 2 Classic Redis architecture • 3 Distributed systems theory • 4 Problems with two-way and multi-way synchronization • 5 CRDT
  2. Communication in the 19th century

    “At 12:30 am on April 4th, 1841 President William Henry Harrison died of pneumonia just a month after taking office. The Richmond Enquirer published the news of his death two days later on April 6th. The North-Carolina standard newspaper published it on April 14th. His death wasn’t known of in Los Angeles until July 23rd, 110 days after it had occurred.”
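  3. A map from 1881

    The map shows how long a message sent from London took to arrive:
    • Green areas could be reached within 10 days
    • Yellow areas took 10-20 days
    • Pink areas took 20-30 days
    • Blue areas took 30-40 days
    • Brown areas took more than 40 days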
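  4. Data Replication Center

    The concept of a DRC (Data Replication Center) emerged in recent years, as cloud computing took off and multi-site deployment created a need for data sharing. In a multi-site architecture, cross-data-center data access has always been the biggest pain point for applications deployed as self-contained units. Today many users:
    • either write the same data to databases at both sites
    • or write to a database across sites and synchronize the data back (for example, AWS Aurora)
    Neither approach solves the problem at its root. The DRC concept has given people new expectations for distributed storage.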
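  5. Technology choices: Availability, Partition tolerance, Strong Eventual Consistency

    First, P (network partition) is the primary consideration. Second, the whole point of cross-region deployment is to improve availability. Finally, we use "eventual consistency" to resolve data conflicts.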
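  6. Bidirectional loop

    Diagram: a client sends `set k v`; Redis A and Redis B each apply `set k v`.
    The loop occurs when two peers synchronize with each other. Suppose there are two Redis instances, A and B:
    • A receives a client request: `set k v`
    • A notifies B of the request
    • B receives the request and notifies A again
    Solution: mark the client type, so a node can tell a replication peer from an ordinary client.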
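The "mark the client type" fix can be sketched in a few lines. This is a toy model, not Redis code: the `Node` class, the `from_peer` flag, and the `forwarded` counter are all hypothetical names introduced for illustration. The assumption is that a node can tell whether a write arrived from an ordinary client or over the replication link.

```python
class Node:
    """Toy stand-in for one Redis instance (not the Redis API)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peer = None
        self.forwarded = 0  # number of writes this node sent to its peer

    def set(self, key, value, from_peer=False):
        self.data[key] = value
        # Only writes from ordinary clients are replicated. A write that
        # arrived over the replication link is applied and stops here,
        # which breaks the A -> B -> A loop.
        if not from_peer and self.peer is not None:
            self.forwarded += 1
            self.peer.set(key, value, from_peer=True)


a, b = Node("A"), Node("B")
a.peer, b.peer = b, a

a.set("k", "v")  # client write on A
print(a.data, b.data, a.forwarded, b.forwarded)
```

Without the `from_peer` check, the call would bounce between A and B until the interpreter hit its recursion limit.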
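  7. Replication loop

    Diagram: a client sends `set k v`, which is then passed from node to node.
    The loop occurs when multiple nodes synchronize with each other. Unlike the bidirectional loop, marking the client source is not enough here: it cannot handle the A -> B -> C -> A case.
    Solution:
    • Mark the origin of the data
    • Forward only data that comes from applications (do not forward replicated data)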
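A minimal sketch of origin marking, assuming a ring topology A -> B -> C -> A in which each node must relay writes to its successor. `RingNode`, `client_set`, `replicate`, and the `origin` argument are hypothetical names, not part of any Redis API.

```python
class RingNode:
    """Toy node in a replication ring; each write carries its origin."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.next = None
        self.received = 0  # replicated writes applied by this node

    def client_set(self, key, value):
        # Application write: apply locally and tag it with this node's name.
        self.data[key] = value
        self.next.replicate(key, value, origin=self.name)

    def replicate(self, key, value, origin):
        if origin == self.name:
            return  # the write has come full circle, so drop it
        self.received += 1
        self.data[key] = value
        self.next.replicate(key, value, origin)


a, b, c = RingNode("A"), RingNode("B"), RingNode("C")
a.next, b.next, c.next = b, c, a

a.client_set("k", "v")
print([n.data for n in (a, b, c)], [n.received for n in (a, b, c)])
```

In a full mesh, where every node is directly connected to every other node, the second rule from the slide is sufficient on its own: a node forwards application writes to all peers and never relays replicated ones.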
  8. Strong Eventual Consistency (SEC) of data across multiple sites: CRDT

    Whereas eventual consistency is only a liveness guarantee (updates will be observed eventually), strong eventual consistency (SEC) adds the safety guarantee that any two nodes that have received the same (unordered) set of updates will be in the same state. If, furthermore, the system is monotonic, the application will never suffer rollbacks.
  9. Conflict-Free Data Types

    Wikipedia definition: In distributed computing, a conflict-free replicated data type (CRDT) is a data structure which can be replicated across multiple computers in a network, where the replicas can be updated independently and concurrently without coordination between the replicas, and where it is always mathematically possible to resolve inconsistencies which might result.
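  10. What CRDTs can do: concurrent conflicts

    An example: the LWW (Last Writer Wins) Register, which suits key/value storage. It resolves data conflicts by using a unix timestamp, or a similar counter that approximates wall-clock time, so that the data becomes eventually consistent.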
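An LWW register can be sketched as a value paired with a timestamp, where merge keeps the entry with the larger timestamp. The `LWWRegister` class below is illustrative only. Breaking timestamp ties by `node_id` is an assumption added here so that two replicas always pick the same winner; the slides do not say how ties are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    value: object = None
    timestamp: float = 0.0
    node_id: str = ""  # assumed tie-breaker, not taken from the slides

    def set(self, value, timestamp, node_id):
        # A local write is just a merge with a freshly stamped register.
        return self.merge(LWWRegister(value, timestamp, node_id))

    def merge(self, other):
        # Compare (timestamp, node_id) so equal timestamps still
        # resolve to the same winner on every replica.
        if (other.timestamp, other.node_id) > (self.timestamp, self.node_id):
            return other
        return self


on_a = LWWRegister().set("v1", 100, "A")  # write at site A
on_b = LWWRegister().set("v2", 101, "B")  # concurrent write at site B

print(on_a.merge(on_b).value, on_b.merge(on_a).value)
```

Both merge orders yield the later write. Note that LWW depends on reasonably synchronized clocks: a site with a fast clock will win conflicts it should have lost.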
  11. State-based Replication

    State-based CRDTs are called convergent replicated data types, or CvRDTs. In contrast to CmRDTs, CvRDTs send their full local state to other replicas, where the states are merged by a function which must be commutative, associative, and idempotent.
    • The sender sends its full state to the receiver, and the receiver performs a merge to reach the same state as the sender
    • State-based replication suits unstable networks, where retransmissions are common
    • It requires the data structure to support commutativity, associativity, and idempotence
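To make the three merge properties concrete, here is a sketch of a grow-only counter (G-Counter), a standard state-based CRDT. It is chosen only as an illustration and is not taken from the slides. Each replica's state is a dict from node name to that node's local count, and merge takes the per-node maximum.

```python
def increment(state, node, amount=1):
    """Return a new state in which `node` has counted `amount` more."""
    new_state = dict(state)
    new_state[node] = new_state.get(node, 0) + amount
    return new_state


def merge(a, b):
    """Per-node maximum: commutative, associative, and idempotent."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def value(state):
    return sum(state.values())


x = increment({}, "A", 3)
y = increment(increment({}, "A", 1), "B", 2)
z = increment({}, "C", 5)

print(merge(x, y), value(merge(x, y)))
print(merge(x, y) == merge(y, x))                      # commutative
print(merge(merge(x, y), z) == merge(x, merge(y, z)))  # associative
print(merge(x, x) == x)                                # idempotent
```

Idempotence is what makes retransmission safe: receiving the same state twice leaves the replica unchanged.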
  12. Operation-based Replication

    Operation-based CRDTs are referred to as commutative replicated data types, or CmRDTs. CmRDT replicas propagate state by transmitting only the update operation. For example, a CmRDT of a single integer might broadcast the operations (+10) or (−20).
    • The sender converts each state change into an operation or log entry and sends it to the receiver, which applies the update to reach the same state as the sender
    • Op-based replication only requires the data structure to be commutative; it does not require idempotence
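The integer example can be sketched directly. `OpCounter` is a hypothetical class; the point is that additions commute, so every delivery order converges, while a duplicated operation changes the result. That is why op-based replication needs the transport to deliver each operation exactly once.

```python
import itertools


class OpCounter:
    """Replica of a single integer that applies broadcast deltas."""

    def __init__(self):
        self.value = 0

    def apply(self, delta):
        self.value += delta


ops = [+10, -20, +5]

# Apply the same operations in every possible order.
results = set()
for order in itertools.permutations(ops):
    replica = OpCounter()
    for op in order:
        replica.apply(op)
    results.add(replica.value)
print(results)  # a single value: order does not matter

# Deliver (+10) twice: the replica diverges, because apply is not idempotent.
dup = OpCounter()
for op in ops + [+10]:
    dup.apply(op)
print(dup.value)
```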
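  13. CRDT Replication

    State-based Replication
    • Usually synchronizes the full state, which generates too much network traffic and makes synchronization inefficient. In a system where the synchronization mechanism is already established, we prefer op-based replication, which saves traffic and synchronizes quickly.
    Op-based Replication
    • An academic idea that is argued under the assumption of unbounded resources. In practice storage is not infinite, and a site cannot buffer its entire operation history. This raises a problem: when a new node is added, or the network is down for too long, there is not enough storage to keep all historical operations, so replication cannot proceed. In that case we fall back on state-based replication to merge state between sites.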
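The trade-off described above can be sketched as a bounded operation log with a fallback. `OpLog`, `plan_sync`, and the offset scheme are assumptions made for illustration; they loosely resemble a replication backlog but are not the Redis implementation.

```python
from collections import deque


class OpLog:
    """Bounded buffer of (offset, op) pairs; old entries are evicted."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)
        self.next_offset = 0

    def append(self, op):
        self.buffer.append((self.next_offset, op))
        self.next_offset += 1

    def ops_since(self, offset):
        """Ops with offset >= `offset`, or None if some were evicted."""
        oldest = self.buffer[0][0] if self.buffer else self.next_offset
        if offset < oldest:
            return None
        return [op for off, op in self.buffer if off >= offset]


def plan_sync(log, peer_offset):
    """Choose op-based catch-up when possible, else a state merge."""
    ops = log.ops_since(peer_offset)
    if ops is None:
        return ("state", None)  # history is gone: ship state and merge
    return ("ops", ops)


log = OpLog(capacity=3)
for i in range(5):
    log.append(f"op{i}")  # op0 and op1 get evicted

print(plan_sync(log, 3))  # peer is slightly behind
print(plan_sync(log, 0))  # new node, or a peer that was away too long
```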
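  14. Redis Master-Slave Replication

    Incremental synchronization
    • The Redis master receives operations from clients and forwards those that modify the database to the slave. The slave executes the same operations as the master, which keeps master and slave data consistent.
    Full synchronization
    • The master sends its database to the slave as a snapshot (an RDB file), and the slave loads the snapshot to reach the same data as the master
    • Used when a new slave is added, or when the replication buffer overflows and master and slave must resynchronize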
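  15. Implementing CRDT replication in Redis

    FullSync
    • Because of physical limits, one machine cannot hold the entire operation history indefinitely. When a new node joins, state-based replication is a good fit.
    PartiallySync
    • Because Redis already supports incremental synchronization, operation-based replication is a good fit here: instead of sending the whole system state, it sends only an op-log.
    AdvancedFullSync
    • When resuming after an interruption, the buffer that caches the op-log may no longer be large enough, yet the peer still holds part of the history. In this case delta-based state replication is a better fit.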
  16. CRDT implementation in Redis

    Key/Value
    • LWW Register
    Map
    • Add Wins
    • LWW
    Expire
    • Larger TTL Wins
    GC
  17. Redis String

    Normal synchronization. Data Type: Strings. Use Case: Common SETs. Conflict Resolution: None.
    Concurrent conflict. Data Type: Strings. Use Case: Concurrent SETs. Conflict Resolution: Last Write Wins (LWW).
  18. Redis Map

    Normal synchronization. Data Type: Maps. Use Case: Common HSET. Conflict Resolution: None.
    Concurrent conflict 1. Data Type: Maps. Use Case: Concurrent HSET. Conflict Resolution: ADD WINS.
  19. Redis Map (continued)

    Concurrent conflict 2. Data Type: Maps. Use Case: Concurrent HSET. Conflict Resolution: LWW (Last Write Wins).
    Concurrent conflict 3. Data Type: Maps. Use Case: Concurrent HSET. Conflict Resolution: ADD WINS && LWW (Last Write Wins).
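The slides do not show how the map combines the two rules, so the following is only one possible sketch, modelled on an observed-remove set. Every `hset` gets a unique tag, `hdel` removes only the tags it has observed, and the visible value of a field is chosen by LWW among its surviving tags. `AWMap` and its method names are hypothetical, and tombstones are never garbage-collected here.

```python
import itertools


class AWMap:
    """Toy add-wins map with last-write-wins values (state-based)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counter = itertools.count()
        self.adds = {}        # tag -> (field, value, timestamp)
        self.removed = set()  # tags deleted by some replica

    def hset(self, field, value, timestamp):
        tag = (self.node_id, next(self.counter))  # unique per write
        self.adds[tag] = (field, value, timestamp)

    def hdel(self, field):
        # Remove only the writes this replica has already seen; a
        # concurrent hset elsewhere carries a new tag and survives.
        self.removed |= {t for t, (f, _, _) in self.adds.items() if f == field}

    def merge(self, other):
        self.adds.update(other.adds)
        self.removed |= other.removed

    def to_dict(self):
        best = {}
        for tag, (field, value, ts) in self.adds.items():
            if tag in self.removed:
                continue
            # LWW among surviving writes; the tag breaks timestamp ties.
            if field not in best or (ts, tag) > best[field][0]:
                best[field] = ((ts, tag), value)
        return {f: v for f, (_, v) in best.items()}


a, b = AWMap("A"), AWMap("B")
a.hset("f", "v0", 1)
b.merge(a)

a.hdel("f")           # concurrent delete at A ...
b.hset("f", "v1", 2)  # ... and write at B: the add wins
a.hset("g", "x", 5)   # concurrent writes to the same field:
b.hset("g", "y", 6)   # the later timestamp wins

a.merge(b)
b.merge(a)
print(a.to_dict(), b.to_dict())
```

Both replicas converge on the same map: field `f` survives the concurrent delete, and field `g` holds the later write.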
  20. Redis String Expiration

    Normal synchronization. Data Type: Strings. Use Case: Common EXPIRATION. Conflict Resolution: None.
    Concurrent conflict. Data Type: Strings. Use Case: Concurrent EXPIRATION. Conflict Resolution: Larger TTL wins.
    When Expire operations conflict concurrently, we apply the Larger TTL Wins strategy.
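Larger TTL Wins can be sketched as taking the later of two expiry times. Two assumptions are made here that the slides do not state: expiry is represented as an absolute unix timestamp, and `None` means "no expiry" and is treated as the largest possible TTL. `merge_expire_at` is a hypothetical helper name.

```python
import math


def merge_expire_at(local, remote):
    """Return the later of two absolute expiry times.

    `None` means the key never expires and (by assumption) always wins.
    """
    def as_number(t):
        return math.inf if t is None else t

    return local if as_number(local) >= as_number(remote) else remote


print(merge_expire_at(1_700_000_100, 1_700_000_200))
print(merge_expire_at(1_700_000_200, 1_700_000_100))
print(merge_expire_at(None, 1_700_000_100))
```

Because the function is a maximum, it is commutative, associative, and idempotent, so replicas agree on the expiry regardless of delivery order. Keeping the larger TTL errs on the side of retaining data rather than deleting it early.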
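  21. The future of CRDT: Data Replication Center and Conflict-free Replicated Data Types

    In a large distributed system, how should Consistency, Availability, and Partition tolerance be traded off for cross-region active-active deployment? Clearly P (network partition) is the primary consideration. Second, the whole point of cross-region deployment is to improve availability. Common consistency protocols, whether 2PC, Paxos, or Raft, all require synchronous cross-region updates in this scenario, which degrades the user experience and also hurts availability during a network partition. C therefore has to come last. Does that mean C cannot be satisfied at all?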
  22. Introduction to CRDTs

    • A CRDT Primer Part I: Defanging Order Theory
    • A CRDT Primer Part II: Convergent CRDTs
    CRDT papers
    • Highly recommended: A comprehensive study of Convergent and Commutative Replicated Data Types
    • Conflict-free replicated data types
    • Delta State Replicated Data Types
    • CRDTs: Making δ-CRDTs Delta-Based
    • Key-CRDT Stores
    • A Conflict-Free Replicated JSON Datatype
    • OpSets: Sequential Specifications for Replicated Datatypes
  23. Talks

    • RedisConf18: CRDTs and Redis - From Sequential to Concurrent Executions by Carlos Baquero
    • QCon London 2018: CRDTs and the Quest for Distributed Consistency by Martin Kleppmann
    • “CRDTs Illustrated” by Arnout Engelen
    • Coding CRDT
    • Dmitry Ivanov & Nami Naserazad - Practical Demystification of CRDT (Lambda Days 2016)
    • ElixirConf 2015 - CRDT: Datatype for the Apocalypse by Alexander Songe
    • GOTO 2016 • Conflict Resolution for Eventual Consistency • Martin Kleppmann
    • CRDTs in IPFS Journal Club - 2018 06 13 CRDT JSON Datatype, by Gonçalo Pestana
    Notes and blog posts
    • CRDT Tutorial for Beginners
    • Conflict-Free Replicated Data Types (CRDTs), An Offline Camp passion talk
    • CRDT Notes by Paul Frazee
    • Towards a unified theory of Operational Transformation and CRDT by Raph Levien
    • A simple approach to building a real-time collaborative text editor
    • Data Laced with History: Causal Trees & Operational CRDTs