Distributed Coordination-Based Systems

Ahmed Magdy

June 01, 2012


  1. Distributed Systems: Chapter 13, Distributed Coordination-Based Systems
    Reference: Distributed Systems: Principles and Paradigms, 2nd Edition
    Prepared by: Ahmed Magdy Ezzeldin
  2. Outline
    ➔ Introduction to Coordination Models
    ➔ Taxonomy of coordination models
    ➔ Architecture
    ➔ Traditional Architectures
    ➔ JavaSpaces
    ➔ TIB/Rendezvous
    ➔ Peer-to-Peer Architectures
    ➔ Gossip-Based Publish/Subscribe System
    ➔ Discussion
    ➔ Mobility and Coordination
    ➔ Lime
    ➔ Processes
    ➔ Communication
    ➔ Content-Based Routing
    ➔ Supporting Composite Subscriptions
  3. Introduction to Coordination Models
    • Instead of concentrating on the transparent distribution of components, emphasis lies on the coordination of activities between those components.
    • Clean separation between computation and coordination.
    • The coordination part of a distributed system handles the communication and cooperation between processes.
    • The focus is on how coordination between the processes takes place.
  4. Taxonomy of coordination models [continued]
    • Referentially coupled, temporally coupled: direct coordination
      - Processes know each other's addresses and/or names
      - Processes know the form of the messages
      - Processes must be up at the same time
    • Referentially coupled, temporally decoupled: mailbox coordination
      - Like persistent message-oriented communication
      - Processes know the form of the messages in advance
      - Processes do not have to be up at the same time
  5. Taxonomy of coordination models [continued]
    • Referentially decoupled, temporally coupled: meeting-oriented coordination
      - Often implemented by means of events
      - Publish/subscribe with no persistence in the dataspace
      - Processes meet at a certain time to coordinate
    • Referentially decoupled, temporally decoupled: generative communication
      - Independent processes make use of a shared persistent dataspace of tuples
      - Processes do not need to agree on the structure of tuples in advance
      - A tuple's tag distinguishes between tuples carrying different kinds of information
      - Publish/subscribe with persistence in the dataspace
  6. Architecture
    • Data items are described by a series of attributes.
    • A subscription is passed to the middleware with a description of the data items that the subscriber is interested in.
    • Subscriptions are described as (attribute, value) pairs, possibly with a range for each attribute and SQL-like predicates.
    • When matching data items are found, the middleware either notifies the subscribers so that they can read the items, or sends the data items directly (no storage is needed if they are sent directly).
    • Publishing processes publish events, which are data items with a single attribute.
    • Matching data items against subscriptions should be efficient and scalable.
  7. Traditional Architectures
    • Centralized client-server architecture.
    • Adopted by many publish/subscribe systems, such as IBM WebSphere and Sun JMS.
    • Generative communication models, such as Sun Jini and JavaSpaces, are based on central servers.
  8. JavaSpaces
    • To read a tuple instance, a process provides a typed template tuple for matching.
    • A field in the template tuple either contains a reference to an actual object or contains the value NULL.
    • Two fields match if they have a copy of the same reference or if the template tuple field is NULL.
    • A tuple instance matches a template tuple if their respective fields match pairwise.
    • Read and Take (which removes the tuple after reading it) block the caller.
    • Some implementations return immediately if there is no matching tuple, or allow a timeout to be set.
    • Centralized implementations make complex matching rules easier and can also be used for synchronization.
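The matching rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the JavaSpaces API: the names `Entry`, `TupleSpace`, `matches`, `read`, and `take` are hypothetical, `None` stands in for NULL, field comparison uses plain equality, and `read`/`take` use the non-blocking variant that returns immediately when nothing matches.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Entry:
    """Hypothetical tuple type; None plays the role of a NULL template field."""
    subject: Optional[str] = None
    value: Optional[int] = None


def matches(template, instance) -> bool:
    # Typed template: the instance must be of the template's type.
    if not isinstance(instance, type(template)):
        return False
    # Pairwise field matching: a None template field matches anything.
    for f in fields(template):
        wanted = getattr(template, f.name)
        if wanted is not None and wanted != getattr(instance, f.name):
            return False
    return True


class TupleSpace:
    """Centralized, in-memory space (assumption: non-blocking read/take)."""

    def __init__(self):
        self._tuples = []

    def write(self, instance):
        self._tuples.append(instance)

    def read(self, template):
        # Return the first matching tuple without removing it, or None.
        for t in self._tuples:
            if matches(template, t):
                return t
        return None

    def take(self, template):
        # Like read, but the matching tuple is removed from the space.
        found = self.read(template)
        if found is not None:
            self._tuples.remove(found)
        return found
```

A template such as `Entry(subject="sports")` matches any stored `Entry` whose subject is "sports", regardless of its value; `Entry()` matches every `Entry` in the space.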
  9. TIB/Rendezvous
    • Instead of using central servers, we can immediately send published tuples to subscribers using multicasting.
    • A data item is a message tagged with a compound keyword describing its content (its subject).
    • Uses broadcast, or unicast if the subscribers' addresses are known.
    • Each host has a rendezvous daemon, which takes care that messages are sent and delivered according to their subject.
    • The daemon keeps a table of (process, subject) entries. Whenever a message on subject S arrives, the daemon checks its table for local subscribers and forwards the message to each one.
    • Can allow complex matching of published data items against subscriptions.
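The daemon's table lookup described above might look roughly like the following. This is a minimal sketch and not TIB/Rendezvous code: the class `Daemon`, its methods, and the per-process inbox lists are hypothetical, and subjects are compared by exact string equality.

```python
class Daemon:
    """Hypothetical per-host daemon holding (process, subject) entries."""

    def __init__(self):
        self.table = []    # list of (process, subject) entries
        self.inboxes = {}  # process name -> messages delivered so far

    def subscribe(self, process, subject):
        self.table.append((process, subject))
        self.inboxes.setdefault(process, [])

    def on_message(self, subject, payload):
        # Find local subscribers for this subject and forward to each one.
        receivers = [p for (p, s) in self.table if s == subject]
        for p in receivers:
            self.inboxes[p].append((subject, payload))
        return receivers
```

A message whose subject has no local subscriber is simply dropped by this daemon, which is why the network can broadcast messages without every process having to inspect them.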
  10. Peer-to-Peer Architectures
    • For scalability, restrictions on describing subscriptions and data items may be necessary.
    • Keywords or (attribute, value) pairs are hashed to unique identifiers for published data, which can be efficiently implemented in a DHT-based system.
    • For more advanced matching rules, we can use gossip-based publish/subscribe systems.
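As a rough illustration of hashing (attribute, value) pairs to identifiers, the sketch below derives a key with SHA-1 and looks up the node responsible for it. The function names, the 32-bit identifier space, and the Chord-style "first node at or after the key" rule are all assumptions made for the example; the slide only says that a DHT-based system is used.

```python
import hashlib


def key_for(attribute, value, bits=32):
    """Hash an (attribute, value) pair to an identifier in [0, 2**bits)."""
    digest = hashlib.sha1(f"{attribute}={value}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def responsible_node(key, node_ids):
    """Assumed successor rule: first node id >= key, wrapping around the ring."""
    ring = sorted(node_ids)
    for node in ring:
        if node >= key:
            return node
    return ring[0]
```

Because a publisher and a subscriber compute the same key for the same pair, both are directed to the same node, which can then match published items against stored subscriptions.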
  11. Gossip-Based Publish/Subscribe Systems
    • A subscription S is a tuple of (attribute, value/range) pairs.
    • As in CAN (Content Addressable Network), attribute values are mapped to floating-point numbers, and subscriber nodes are organized in a 2-dimensional space of floats to form groups.
    • Cluster nodes into M different groups, such that nodes i and j belong to the same group if and only if their subscriptions Si and Sj intersect.
    • Each node maintains a list of references to other neighbors (a partial view) so that it knows which subscriptions intersect with its own.
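To make "subscriptions Si and Sj intersect" concrete, here is a small sketch that represents a subscription as a mapping from attribute name to a closed float range. The helper names and the convention that a missing attribute is unconstrained are assumptions for illustration; the gossip exchange that actually builds the partial view is left out.

```python
def intersects(s1, s2):
    """True if the ranges of two subscriptions overlap on every shared attribute.

    Assumption: an attribute absent from a subscription is unconstrained.
    A single value v is written as the range (v, v).
    """
    for attr in s1.keys() & s2.keys():
        lo1, hi1 = s1[attr]
        lo2, hi2 = s2[attr]
        if max(lo1, lo2) > min(hi1, hi2):
            return False
    return True


def partial_view(node, subscriptions):
    """Neighbors of `node` whose subscriptions intersect with its own."""
    mine = subscriptions[node]
    return sorted(
        other
        for other, theirs in subscriptions.items()
        if other != node and intersects(mine, theirs)
    )
```

In a real system each node would learn about candidates through gossiping and keep only the intersecting ones; here the full subscription table is passed in to keep the example short.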
  12. Discussion
    • Similar to gossip-based systems.
    • Each attribute ai is handled by a separate process Pi, which in turn partitions the range of its attribute across multiple processes.
    • When a data item d is published, it is forwarded to each Pi, where it is stored at the process responsible for d's value of ai.
  13. Mobility and Coordination
    • Knowing whether a mobile peer has received a message is problematic.
    • Two solutions are suggested:
      1. The receiving mobile process saves older messages to make sure it does not accept duplicates.
      2. Routers are devised to keep track of the mobile peers and to know which messages they have received (harder to implement).
  14. Lime
    • In Lime, each mobile process has its own dataspace.
    • When processes are connected, their dataspaces become shared.
    • Formally, the processes should be members of the same group and use the same group communication protocol.
    • The local dataspaces of connected processes form a transiently shared dataspace to allow the exchange of tuples.
    • To control how tuples are distributed, dataspaces can carry out "reactions". A reaction specifies an action to be executed when a tuple matching a given template is found in the local dataspace.
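The idea of a reaction can be sketched as a callback registered together with a template. The code below is not the Lime API: `Dataspace`, `react_to`, and `out` are hypothetical names, tuples are plain Python tuples, `None` is a wildcard field, and reactions are assumed to fire at the moment a matching tuple is written into the local dataspace.

```python
def matches(template, tup):
    """A template matches a tuple of equal length; None fields are wildcards."""
    return len(template) == len(tup) and all(
        want is None or want == got for want, got in zip(template, tup)
    )


class Dataspace:
    """Hypothetical local dataspace of one mobile process."""

    def __init__(self):
        self.tuples = []
        self.reactions = []  # list of (template, action) pairs

    def react_to(self, template, action):
        self.reactions.append((template, action))

    def out(self, tup):
        # Store the tuple, then run every reaction whose template matches it.
        self.tuples.append(tup)
        for template, action in self.reactions:
            if matches(template, tup):
                action(tup)
```

An action could, for example, copy the matching tuple into the dataspace of a connected process, which is one way to control how tuples are distributed.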
  15. Processes
    • Nothing special: we just need efficient mechanisms for searching a large collection of data.
    • The main problem is devising schemes that work well in distributed environments.
  16. Communication
    • In Java-based systems, remote method invocation (RMI) is used for communication.
    • In wide-area networks, the system should be self-organizing or use content-based routing to ensure that a message reaches only its intended subscribers.
  17. Content-Based Routing
    • Routers can take routing decisions based on the message content, so that they cut off routes that do not lead to receivers of the message.
    • Clients can tell the servers which messages they are interested in, so that the servers notify them when they receive a relevant message. This is done in 2 layers, where layer 1 consists of a shared broadcast tree connecting the servers using routers.
    • Consider simple subject-based publish/subscribe, in which each subject is a unique (non-compound) keyword.
    • We can send each published message to every server, as in TIB/Rendezvous.
  18. Content-Based Routing [continued]
    • Or let every server broadcast its subscriptions to all other servers, so that each server is able to compile a list of (subject, destination) pairs.
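Compiling the list of (subject, destination) pairs can be sketched as follows. The function names and the dictionary-based representation are assumptions for illustration; the broadcast of subscriptions itself is modeled simply as every server seeing the same `subscriptions_by_server` mapping.

```python
from collections import defaultdict


def compile_routing_table(subscriptions_by_server):
    """Build subject -> set of destination servers from broadcast subscriptions."""
    table = defaultdict(set)
    for server, subjects in subscriptions_by_server.items():
        for subject in subjects:
            table[subject].add(server)
    return table


def destinations(table, subject, origin):
    """Servers that should receive a message on `subject` published at `origin`."""
    return sorted(table.get(subject, set()) - {origin})
```

With such a table, a published message is sent only to servers that have subscribers for its subject, rather than to every server.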
  19. Content-Based Routing [continued]
    • Each server broadcasts its subscription across the network so that routers can compose routing filters.
    • When a node leaves the system, it should cancel its subscriptions and essentially broadcast this information to all routers.
    • Comparison of subscriptions and data items to be routed can be computationally expensive.
  20. Supporting Composite Subscriptions
    • When subscriptions are written as more sophisticated expressions, the simple content-based routing described so far is no longer sufficient and another approach is needed.
    • Express compositions of subscriptions, in which a process specifies in a single subscription that it is interested in very different types of data items.
    • Design routers analogous to rule databases, where subscriptions are transformed into rules stating the conditions under which published data should be forwarded.
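One way to picture a router that behaves like a rule database is sketched below. Everything here is hypothetical: the `Rule` type, the `forward` function, the link names, and the example attributes (`type`, `price`, `city`) are made up for illustration, and a composite subscription is modeled as a single predicate covering two unrelated kinds of data items.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A subscription turned into a forwarding rule (hypothetical shape)."""
    condition: Callable[[dict], bool]  # when should a data item be forwarded?
    next_hop: str                      # where should it be forwarded to?


def forward(rules: List[Rule], item: dict) -> List[str]:
    """Return the next hops of all rules whose condition holds for the item."""
    return sorted({rule.next_hop for rule in rules if rule.condition(item)})


def composite(item: dict) -> bool:
    # One subscription covering two very different types of data items.
    is_expensive_stock = item.get("type") == "stock" and item.get("price", 0) > 100
    is_cairo_weather = item.get("type") == "weather" and item.get("city") == "Cairo"
    return is_expensive_stock or is_cairo_weather


rules = [
    Rule(condition=composite, next_hop="link-A"),
    Rule(condition=lambda item: item.get("type") == "stock", next_hop="link-B"),
]
```

Evaluating every rule against every item is the simplest possible strategy; it also shows why comparing subscriptions with data items can become expensive as the number of rules grows.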