Slide 1

Slide 1 text

INTRODUCTION TO GOSSIP

Slide 2

Slide 2 text

ABOUT @FLOPEZLUIS
• Director of Engineering @ShuttleCloudEng
• Co-organizer of PapersWeLoveMad and Distributed System Mad
• Author of Mastering Python Regular Expressions

Slide 3

Slide 3 text

THE PROBLEM
‣ Each node knows every other node
‣ Traditional master and slave
‣ Paxos or other consensus-based algorithms
‣ BitTorrent-based protocols, another P2P approach
‣ Multicast

Slide 4

Slide 4 text

WHAT’S GOSSIP USED FOR?
‣ Database replication
‣ Information dissemination
‣ Cluster membership
‣ Failure detectors
‣ Overlay networks
‣ Aggregations
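As one illustration of the aggregation use case, the sketch below lets every node repeatedly average its value with a random peer, so that all nodes drift toward the global mean without any coordinator. The function name `gossip_average`, the synchronous rounds and the uniform peer choice are assumptions made for this example; they are not taken from the slides.

```python
import random


def gossip_average(values, rounds, rng):
    """Pairwise averaging: in each round every node averages with a random peer."""
    values = list(values)
    n = len(values)
    for _ in range(rounds):
        for node in range(n):
            peer = rng.randrange(n - 1)
            if peer >= node:
                peer += 1                        # a random peer other than the node itself
            mean = (values[node] + values[peer]) / 2.0
            values[node] = values[peer] = mean   # the pair's sum is preserved
    return values


# Hypothetical input: 100 nodes holding the values 0..99 (true mean 49.5).
estimates = gossip_average(range(100), rounds=30, rng=random.Random(7))
print(min(estimates), max(estimates))
```

Because each exchange preserves the sum of the two values involved, the global mean never changes while the spread between nodes shrinks.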

Slide 5

Slide 5 text

WHAT DO THEY HAVE IN COMMON?
‣ Riak
‣ Cassandra
‣ Dynamo
‣ Consul
‣ Amazon S3
‣ Docker Swarm
‣ ElasticSearch
‣ Hazelcast
‣ Redis Cluster
‣ Akka
‣ Flume (Cloudera)
‣ Bitcoin
‣ Dynomite
‣ Tribler
‣ Comcast
‣ …

Slide 6

Slide 6 text

8 FALLACIES OF DISTRIBUTED COMPUTING
1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.

Slide 7

Slide 7 text

BROADCAST PROTOCOL
A primary use of gossip is information diffusion: some event occurs, and our goal is to spread the word [3].

Slide 8

Slide 8 text

GOSSIP AND EPIDEMICS
"Trying to squash a rumor is like trying to unring a bell." ~ Shana Alexander
"Anyone can start a rumor, but none can stop one." ~ American proverb


Slide 9

Slide 9 text

HOW DO THEY WORK?

Slide 10

Slide 10 text

STRENGTHS OF GOSSIP
▸ Scalable
▸ Fault-tolerant
▸ Robust
▸ Convergent consistency
▸ Extremely decentralized form of information discovery
▸ Little code and complexity

Slide 11

Slide 11 text

STRENGTHS OF GOSSIP
▸ Ability to operate in networks with irregular and unknown connectivity [3]
▸ Robust
▸ Convergent consistency, typically reached in O(log(N)) rounds
▸ Gossip offers an extremely decentralized form of information discovery, and its latencies are often acceptable if the information won’t actually be used immediately. [3]
▸ Little code and complexity

Slide 12

Slide 12 text

FORMAL DEFINITION
There have been many attempts to formally define gossip, but there is no standard definition [13].

Slide 13

Slide 13 text

FORMAL DEFINITION
In general, gossip protocols have these properties [4] [13]:
‣ Node selection must be random, or at least guarantee enough peer diversity
‣ Only local information is available at all nodes
‣ Communication is round-based (periodic)
‣ Transmission and processing capacity per round is limited
‣ All nodes run the same protocol

Slide 14

Slide 14 text

GOSSIP PROPERTIES
▸ They are randomized algorithms [1]
▸ They are not deterministic [3]

Slide 15

Slide 15 text

EPIDEMIC ALGORITHMS FOR REPLICATED DATABASE MAINTENANCE
The paper “Epidemic Algorithms for Replicated Database Maintenance” [1] (1987) is considered seminal.
See also: “On disseminating information reliably without broadcasting”, Proceedings of the International Conference on Distributed Computing Systems (1987), pp. 74–81 [14].

Slide 16

Slide 16 text

EPIDEMIC ALGORITHMS FOR REPLICATED DATABASE MAINTENANCE
‣ They were trying to build a directory, a lookup database.
‣ The network was unreliable.
‣ The database was replicated at 300 nodes (or more).
‣ All servers accept updates.
‣ Each update is injected at a single site and is either propagated to all sites or superseded by a later update.
‣ Replicas become consistent once no new updates arrive.
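The update rule on this slide (a later update supersedes an earlier one, and replicas converge once updates stop) can be sketched as a timestamped merge. This is a minimal illustration, not the code from [1]: the `Replica` class, its `put` and `merge` methods, the `exchange` helper and the integer timestamps are all assumptions made for the example.

```python
class Replica:
    """Toy replica: maps each key to a (timestamp, value) pair."""

    def __init__(self):
        self.store = {}

    def put(self, key, value, timestamp):
        # Accept a write only if it is newer than what we already hold.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def merge(self, other):
        # Copy every entry from the other replica; put() ignores older ones.
        for key, (timestamp, value) in other.store.items():
            self.put(key, value, timestamp)


def exchange(a, b):
    """Two-way reconciliation: afterwards both replicas hold the same data."""
    a.merge(b)
    b.merge(a)


a, b = Replica(), Replica()
a.put("printer", "room-101", timestamp=1)   # update injected at site a
b.put("printer", "room-202", timestamp=2)   # later update injected at site b
b.put("mail", "server-7", timestamp=1)
exchange(a, b)
print(a.store == b.store)
```

Ties between equal timestamps are not resolved here; a real system would need a deterministic tie-breaker such as a site identifier.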

Slide 17

Slide 17 text

EPIDEMIC ALGORITHMS FOR REPLICATED DATABASE MAINTENANCE
The gossip protocol literature has adopted some terms from the epidemiology literature [1]:
▸ Infective. A node with an update that it is willing to share.
▸ Susceptible. A node that has not received the update yet (it is not infected).
▸ Removed. A node that has already received the update but is no longer willing to share it.

Slide 18

Slide 18 text

EPIDEMIC ALGORITHMS FOR REPLICATED DATABASE MAINTENANCE
They analysed three methods for spreading updates:
‣ Direct mail
‣ Anti-entropy
‣ Rumor mongering

Slide 19

Slide 19 text

TYPES OF GOSSIP
▸ Anti-entropy (SI model): simple epidemics. A node is always either susceptible or infective.
▸ Rumor mongering (SIR model): complex epidemics. A node can be susceptible, infective or removed.
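The SIR behaviour of rumor mongering can be sketched with a small simulation. The details below are assumptions chosen for illustration rather than a definitive implementation: one infective node acts at a time, it contacts a uniformly random peer, and it becomes removed with probability 1/k whenever the peer already knew the rumor. The function name `rumor_mongering` and its parameters are hypothetical, and n must be at least 2.

```python
import random


def rumor_mongering(n, k, rng):
    """Simulate one rumor on n nodes; return the fraction left susceptible."""
    SUSCEPTIBLE, INFECTIVE, REMOVED = 0, 1, 2
    state = [SUSCEPTIBLE] * n
    state[0] = INFECTIVE              # the update is injected at a single node
    infective = [0]

    while infective:                  # gossip stops when nobody is infective
        sender = rng.choice(infective)
        peer = rng.randrange(n - 1)
        if peer >= sender:            # pick a peer other than the sender
            peer += 1
        if state[peer] == SUSCEPTIBLE:
            state[peer] = INFECTIVE
            infective.append(peer)
        elif rng.random() < 1.0 / k:
            # The peer already knew the rumor: lose interest with probability 1/k.
            state[sender] = REMOVED
            infective.remove(sender)

    return state.count(SUSCEPTIBLE) / n


print(rumor_mongering(2000, k=1, rng=random.Random(42)))
```

Unlike anti-entropy, this process can stop while some nodes are still susceptible, which is why rumor mongering is often backed up by occasional anti-entropy exchanges.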

Slide 20

Slide 20 text

MODELLING RUMOR SPREADING
‣ s: the proportion of nodes that remain susceptible when gossip stops.
‣ k: the average number of times a node sends the update to a peer that already has it before it loses interest.

Slide 21

Slide 21 text

MODELLING RUMOR SPREADING
The residue satisfies $s = e^{-(k+1)(1-s)}$ [1]. At k = 1 this formula suggests that about 20% of nodes will miss the update; at k = 2 only 6% will miss it; at k = 5, about 0.25%.
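The percentages on this slide can be checked numerically. The sketch below assumes the residue equation from [1], s = exp(-(k + 1)(1 - s)), and solves it by fixed-point iteration; the function name `residual_fraction` and the iteration count are choices made for this example.

```python
import math


def residual_fraction(k, iterations=200):
    """Solve s = exp(-(k + 1) * (1 - s)) for the root with 0 < s < 1."""
    s = 0.0                       # start away from the trivial root s = 1
    for _ in range(iterations):
        s = math.exp(-(k + 1) * (1.0 - s))
    return s


for k in (1, 2, 5):
    print(f"k={k}: {residual_fraction(k):.2%} of nodes miss the update")
```

The residue shrinks roughly exponentially in k, so a small increase in redundant sends buys a large gain in coverage.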

Slide 22

Slide 22 text

STRATEGIES FOR SPREADING THE GOSSIP
▸ PUSH
  ▸ Infective nodes are the ones infecting susceptible nodes.
  ▸ Very efficient when there are few updates.
▸ PULL
  ▸ All nodes actively pull for updates.
  ▸ Very efficient when there are many updates.
▸ PUSH-PULL
  ▸ The node and the selected peer exchange their information.
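The three strategies can be compared with a round-based sketch that counts how many rounds a single update needs to reach every node. It assumes synchronous rounds, one uniformly random peer per node per round, and a network of at least two nodes; `rounds_to_spread` and its `mode` strings are names invented for this example.

```python
import random


def rounds_to_spread(n, mode, rng):
    """Count synchronous rounds until all n nodes know one update."""
    informed = [False] * n
    informed[0] = True                    # the update starts at a single node
    rounds = 0
    while not all(informed):
        rounds += 1
        snapshot = informed[:]            # decisions use the start-of-round state
        for node in range(n):
            peer = rng.randrange(n - 1)
            if peer >= node:
                peer += 1                 # a random peer other than the node itself
            if mode in ("push", "push-pull") and snapshot[node]:
                informed[peer] = True     # node pushes the update to its peer
            if mode in ("pull", "push-pull") and snapshot[peer]:
                informed[node] = True     # node pulls the update from its peer
    return rounds


for mode in ("push", "pull", "push-pull"):
    print(mode, rounds_to_spread(500, mode, random.Random(1)))
```

Push tends to slow down near the end, when most contacts hit nodes that already have the update, while pull finishes quickly once most peers are informed; push-pull combines both effects.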


Slide 23

Slide 23 text

CAVEATS
▸ Not very efficient: a message can arrive at the same node several times
▸ High bandwidth consumption
▸ Latency
▸ The randomness inherent in many gossip protocols can make it hard to reproduce and debug unexpected problems that arise at runtime
▸ Gossip protocols do not scale well in some situations

Slide 24

Slide 24 text

https://blog.shuttlecloud.com/
https://flopezluis.github.io/gossip-simulator/

Slide 25

Slide 25 text

REFERENCES
▸ [1] A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, and D. Terry. “Epidemic Algorithms for Replicated Database Maintenance.” In Proc. Sixth Symp. on Principles of Distributed Computing, pp. 1–12, Aug. 1987. ACM.
▸ [2] W. O. Kermack and A. G. McKendrick. “A Contribution to the Mathematical Theory of Epidemics.” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 115 (772), 1927.
▸ [3] Ken Birman. “The Promise, and Limitations, of Gossip Protocols.” SIGOPS Oper. Syst. Rev., 41(5):8–13, October 2007.
▸ [4] Márk Jelasity. “Gossip-based Protocols for Large-scale Distributed Systems.” 2013.
▸ [5] J. Leitão, J. Pereira, and L. Rodrigues. “Epidemic broadcast trees.” In J. Huai, R. Baldoni, and I. Yen, editors, IEEE International Symposium on Reliable Distributed Systems, pp. 301–310. IEEE Computer Society, 2007.
▸ [6] Ali Saidi and Mojdeh Mohtashemi. “Minimum-cost first-push-then-pull gossip algorithm.” IEEE Wireless Communications and Networking Conference, WCNC, pp. 2554–2559, 2012.


Slide 26

Slide 26 text

REFERENCES
▸ [7] M. Jelasity, R. Guerraoui, A.-M. Kermarrec, and M. van Steen. “The peer sampling service: Experimental evaluation of unstructured gossip-based implementations.” In Middleware 2004, H.-A. Jacobsen, Ed. Lecture Notes in Computer Science, vol. 3231. Springer-Verlag, pp. 79–98, 2004.
▸ [8] http://status.aws.amazon.com/s3-20080720.html
▸ [9] http://docs.datastax.com/en/cassandra/3.0/cassandra/architecture/archGossipAbout.html
▸ [10] https://www.consul.io/docs/internals/gossip.html
▸ [11] Robbert van Renesse, Yaron Minsky, and Mark Hayden. “A Gossip-Style Failure Detection Service.” Dept. of Computer Science, Cornell University, 4118 Upson Hall, Ithaca, NY 14853.
▸ [12] Indranil Gupta, Tushar D. Chandra, and Germán S. Goldszmidt. “On scalable and efficient distributed failure detectors.” In Proceedings of the Twentieth Annual ACM Symposium on Principles of Distributed Computing, PODC ’01, pp. 170–179, New York, NY, USA, 2001. ACM. ISBN 1-58113-383-9. doi: 10.1145/383962.384010. URL http://doi.acm.org/10.1145/383962.384010

Slide 27

Slide 27 text

REFERENCES
▸ [13] A. Montresor. “Intelligent Gossip.” In Studies on Computational Intelligence, Intelligent Distributed Computing, Systems and Applications, Springer, Heidelberg, 2008.
▸ [14] “On disseminating information reliably without broadcasting.” Proceedings of the International Conference on Distributed Computing Systems (1987), pp. 74–81.
▸ [15] Brenda Baker and Robert Shostak. “Gossips and telephones.” Discrete Mathematics, 2(3):191–193, June 1972.
▸ [16] http://www.inf.u-szeged.hu/~jelasity/ddm/gossip.pdf
▸ [17] Anne-Marie Kermarrec and Maarten van Steen. “Gossiping in distributed systems.” ACM SIGOPS Operating Systems Review, 41(5):2–7, 2007.
▸ [18] S. Voulgaris, M. Jelasity, and M. van Steen. “A Robust and Scalable Peer-to-Peer Gossiping Protocol.” Lecture Notes in Computer Science (LNCS), vol. 2872, Springer, Berlin/Heidelberg, 2004, pp. 47–58. doi:10.1007/b104265
