Distributed Coordination-Based Systems

Ahmed Magdy

June 01, 2012
Transcript

  1. Distributed Systems : Chapter 13
    Distributed
    Coordination-Based
    Systems
    Reference: Distributed Systems Principles and Paradigms 2nd Edition
    Prepared by: Ahmed Magdy Ezzeldin


  2. Outline

    Introduction to Coordination Models

    Taxonomy of coordination models

    Architecture

    Traditional Architectures

    JavaSpaces

    TIB/Rendezvous

    Peer-to-Peer Architectures

    Gossip-Based Publish/Subscribe System

    Discussion

    Mobility and Coordination

    Lime

    Processes

    Communication

    Content-Based Routing

    Supporting Composite Subscriptions


  3. Introduction to Coordination Models

    Instead of concentrating on the transparent
    distribution of components, emphasis lies on the
    coordination of activities between those components.

    Clean separation between computation and
    coordination

    The coordination part of a distributed system
    handles the communication and cooperation between
    processes.

    The focus is on how coordination between the
    processes takes place.


  4. Taxonomy of coordination models
    Taxonomy of coordination models in
    2 dimensions (temporal and referential)


  5. Taxonomy of coordination models [continued]

    Referentially coupled, temporally coupled: direct
    coordination
    - Processes know each other's addresses and/or names
    - Processes know the form of the messages
    - Processes must be up at the same time

    Referentially coupled, temporally decoupled: mailbox
    coordination
    - Like persistent message-oriented communication
    - Processes know the form of the messages in advance
    - Processes do not have to be up at the same time


  6. Taxonomy of coordination models [continued]

    Referentially decoupled, temporally coupled:
    meeting-oriented coordination
    - Often implemented by events
    - Publish/subscribe with no persistence in the dataspace
    - Processes meet at a certain time to coordinate

    Referentially decoupled, temporally decoupled: generative
    communication
    - Independent processes make use of a shared persistent
    dataspace of tuples
    - Processes do not agree on the structure of tuples in advance
    - A tuple tag distinguishes between tuples that carry
    different kinds of information
    - Publish/subscribe with persistence in the dataspace
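
    The four combinations of referential and temporal coupling can be
    summarized as a small lookup table. The sketch below is only an
    illustration; the function name and the string labels are chosen
    here and are not taken from the reference book.

```python
def coordination_model(referentially_coupled: bool, temporally_coupled: bool) -> str:
    """Return the coordination model for one cell of the 2x2 taxonomy."""
    table = {
        (True, True): "direct",
        (True, False): "mailbox",
        (False, True): "meeting-oriented",
        (False, False): "generative communication",
    }
    return table[(referentially_coupled, temporally_coupled)]
```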


  7. Architecture

    Data items are described by a series of attributes

    Subscription is passed to the middleware with a
    description of the data items that the subscriber is
    interested in.

    A subscription is described as (attribute, value) pairs,
    possibly with a range for each attribute and SQL-like
    predicates.

    When matching data items are found, the middleware either
    notifies the subscribers so that they can read the items, or
    forwards the items directly (no storage if sent directly).

    Publishing processes publish events, which are data
    items with a single attribute.

    Matching data items against subscriptions should be
    efficient and scalable.
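
    A minimal sketch of matching a data item against a subscription,
    assuming subscriptions are dictionaries that map an attribute name
    to either an exact value or a range. The names Range, matches,
    subscription, item_ok and item_bad are hypothetical and only serve
    the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high] used as a constraint on one attribute."""
    low: float
    high: float

    def contains(self, value) -> bool:
        return self.low <= value <= self.high


def matches(subscription: dict, item: dict) -> bool:
    """True if the item satisfies every constraint in the subscription."""
    for attribute, constraint in subscription.items():
        if attribute not in item:
            return False
        value = item[attribute]
        if isinstance(constraint, Range):
            if not constraint.contains(value):
                return False
        elif value != constraint:
            return False
    return True


subscription = {"symbol": "ACME", "price": Range(10.0, 20.0)}
item_ok = {"symbol": "ACME", "price": 12.5, "volume": 300}
item_bad = {"symbol": "ACME", "price": 25.0}
```

    This linear check is the simplest possible matcher; a middleware that
    has to scale would need an index over subscriptions instead.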


  8. Architecture [continued]


  9. Traditional Architectures

    Centralized client-server architecture

    Adopted by many publish/subscribe systems like
    IBM WebSphere and Sun JMS

    Generative communication models like Sun Jini
    and JavaSpaces are based on central servers


  10. JavaSpaces

    To read a tuple instance, a process provides a typed
    template tuple for matching.

    A field in the template tuple either contains a reference
    to an actual object or contains the value NULL.

    Two fields match if they have a copy of the same
    reference or if the template tuple field is NULL.

    A tuple instance matches a template tuple if their
    respective fields match pairwise.

    Read and Take (which removes the tuple after reading it)
    block the caller until a matching tuple is found.

    Some implementations return immediately if there is no
    matching tuple, or allow a timeout to be set.

    Centralized implementations make complex matching
    rules easier and can also be used for synchronization.
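
    The template matching and the blocking read/take behaviour described
    above can be sketched in Python as follows. This is not the JavaSpaces
    API: the class TupleSpace and its methods are hypothetical, tuples are
    plain positional Python tuples rather than typed entries, None plays
    the role of NULL, and fields are compared by equality instead of by
    reference.

```python
import threading


class TupleSpace:
    """Toy centralized tuple space with blocking read/take and a timeout."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(template, candidate):
        # Fields match pairwise; a None template field matches anything.
        if len(template) != len(candidate):
            return False
        return all(t is None or t == c for t, c in zip(template, candidate))

    def _find(self, template):
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def write(self, tup):
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()  # wake up blocked readers

    def _get(self, template, timeout, remove):
        with self._cond:
            # Block until a match exists or the timeout expires.
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            )
            if not found:
                return None
            tup = self._find(template)
            if remove:
                self._tuples.remove(tup)
            return tup

    def read(self, template, timeout=None):
        return self._get(template, timeout, remove=False)

    def take(self, template, timeout=None):
        return self._get(template, timeout, remove=True)
```

    With timeout=None the caller blocks until a matching tuple is written;
    with timeout=0 the call returns immediately, giving None when nothing
    matches.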


  11. TIB/Rendezvous

    Instead of using central servers, we can immediately send
    published tuples to subscribers using multicasting.

    A data item is a message tagged with a compound
    keyword describing its content (the subject).

    Uses broadcast, or unicast if the subscribers' addresses
    are known.

    Each host has a rendezvous daemon, which takes care
    that messages are sent and delivered according to their
    subject.

    The daemon has a table of (process, subject) entries,
    and whenever a message on subject S arrives, the
    daemon checks its table for local subscribers and
    forwards the message to each one.

    Can allow complex matching of published data items
    against subscriptions
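
    The daemon's (process, subject) table can be sketched as below. This
    is a simplified illustration, not the TIB/Rendezvous API: the class
    LocalDaemon, its methods, and the example subjects are hypothetical,
    and only exact subject matches are handled.

```python
from collections import defaultdict


class LocalDaemon:
    """Toy per-host daemon that dispatches messages by subject."""

    def __init__(self):
        # subject -> list of (process name, delivery callback)
        self._table = defaultdict(list)

    def subscribe(self, process_name, subject, deliver):
        self._table[subject].append((process_name, deliver))

    def on_message(self, subject, payload):
        """Forward an incoming message to every local subscriber of its subject."""
        delivered_to = []
        for process_name, deliver in self._table.get(subject, []):
            deliver(payload)
            delivered_to.append(process_name)
        return delivered_to


daemon = LocalDaemon()
inbox_a, inbox_b = [], []
daemon.subscribe("proc-a", "news.comp.os", inbox_a.append)
daemon.subscribe("proc-b", "news.comp.os", inbox_b.append)
receivers = daemon.on_message("news.comp.os", "kernel released")
ignored = daemon.on_message("news.sports", "match result")
```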


  12. TIB/Rendezvous [continued]


  13. Peer-to-Peer Architectures

    For scalability, restrictions on describing
    subscriptions and data items may be necessary.

    Keywords or (attribute, value) pairs are hashed
    to unique identifiers for published data, which can
    be efficiently implemented in a DHT-based
    system.

    For more advanced matching rules we can use
    Gossip-Based Publish/Subscribe Systems
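
    The hashing of (attribute, value) pairs to identifiers mentioned
    above might look like the sketch below. The choice of SHA-1, the
    32-bit identifier space, and the successor-style lookup are
    assumptions made for the example, and the function names are
    hypothetical. Note that a hash makes collisions unlikely rather than
    impossible.

```python
import hashlib


def key_for(attribute: str, value: str, bits: int = 32) -> int:
    """Hash an (attribute, value) pair to an identifier in [0, 2**bits)."""
    digest = hashlib.sha1(f"{attribute}={value}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def responsible_node(key: int, node_ids) -> int:
    """Successor-style lookup: the first node id >= key, wrapping around."""
    ring = sorted(node_ids)
    for node_id in ring:
        if node_id >= key:
            return node_id
    return ring[0]
```

    A publisher and a subscriber that use the same (attribute, value) pair
    compute the same key and therefore meet at the same node.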


  14. Gossip-Based Publish/Subscribe Systems

    A subscription S is a tuple of (attribute, value/range)
    pairs

    As in CAN (Content Addressable Network), map attribute
    values to floats and organize subscriber nodes in a
    2-dimensional space of floats to form groups

    Cluster nodes into M different groups, such that nodes i
    and j belong to the same group if and only if their
    subscriptions Si and Sj intersect.

    Each node maintains a list of references to other
    neighbors (partial view) to know the intersecting
    subscriptions
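
    Whether two subscriptions intersect can be checked per attribute when
    each subscription is a set of (attribute, range) pairs. The sketch
    below computes every node's neighbors centrally, which a real
    gossip-based system would instead discover by exchanging partial
    views; all names are hypothetical, and an attribute missing from a
    subscription is assumed to be unconstrained.

```python
def intervals_overlap(a, b) -> bool:
    """True if closed intervals a=(low, high) and b=(low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


def subscriptions_intersect(s1: dict, s2: dict) -> bool:
    """Subscriptions intersect if their ranges overlap on every shared attribute."""
    shared = s1.keys() & s2.keys()
    return all(intervals_overlap(s1[attr], s2[attr]) for attr in shared)


def partial_views(subscriptions: dict) -> dict:
    """Map each node to the sorted list of nodes with an intersecting subscription."""
    return {
        node: sorted(
            other
            for other in subscriptions
            if other != node
            and subscriptions_intersect(subscriptions[node], subscriptions[other])
        )
        for node in subscriptions
    }


subs = {
    "n1": {"x": (0.0, 5.0), "y": (0.0, 5.0)},
    "n2": {"x": (4.0, 8.0), "y": (1.0, 2.0)},
    "n3": {"x": (6.0, 9.0), "y": (0.0, 5.0)},
}
views = partial_views(subs)
```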


  15. Discussion

    Similar to gossip-based systems

    Each attribute ai is handled by a separate process Pi,
    which in turn partitions the range of its attribute across
    multiple processes.

    When a data item d is published, it is forwarded to each
    Pi, where it is stored at the process responsible for
    d's value of ai.
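
    The per-attribute partitioning can be sketched as follows, assuming
    each Pi splits its attribute's range at fixed boundaries. The class
    AttributeProcess, the function publish, and the boundary values are
    hypothetical; the partitions stand in for the processes that would
    actually store the item.

```python
import bisect


class AttributeProcess:
    """Toy P_i: splits the range of one attribute across several partitions."""

    def __init__(self, attribute, boundaries):
        self.attribute = attribute
        self.boundaries = sorted(boundaries)  # split points, e.g. [10, 20]
        self.partitions = [[] for _ in range(len(self.boundaries) + 1)]

    def store(self, item):
        """Store the item in the partition that covers its attribute value."""
        index = bisect.bisect_right(self.boundaries, item[self.attribute])
        self.partitions[index].append(item)
        return index


def publish(item, attribute_processes):
    """Forward the item to every P_i whose attribute it carries."""
    return {
        p.attribute: p.store(item)
        for p in attribute_processes
        if p.attribute in item
    }


price = AttributeProcess("price", [10, 20])
volume = AttributeProcess("volume", [100])
placement = publish({"price": 12.5, "volume": 50}, [price, volume])
```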


  16. Mobility and Coordination

    Knowing whether a mobile peer has received a message is
    problematic.

    Two solutions are suggested:
    1- The receiving mobile process saves older messages
    so that it can detect and discard duplicates
    2- Routers keep track of the mobile peers and know
    which messages they have received (harder to
    implement).
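
    The first solution can be sketched as a receiver that remembers what
    it has already seen. The sketch assumes every message carries a unique
    identifier and stores only those identifiers rather than whole
    messages; both the identifier and the class MobileReceiver are
    assumptions of this example.

```python
class MobileReceiver:
    """Toy mobile process that discards messages it has already received."""

    def __init__(self):
        self._seen_ids = set()
        self.delivered = []

    def receive(self, message_id, payload) -> bool:
        """Deliver the payload once; return False for a duplicate."""
        if message_id in self._seen_ids:
            return False
        self._seen_ids.add(message_id)
        self.delivered.append(payload)
        return True


receiver = MobileReceiver()
first = receiver.receive("m1", "hello")
duplicate = receiver.receive("m1", "hello")  # e.g. resent after a reconnect
second = receiver.receive("m2", "world")
```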


  17. Lime

    In Lime, each mobile process has its own dataspace

    When processes are connected, their dataspaces
    become shared.

    Formally, the processes should be members of the same
    group and use the same group communication protocol.

    The local dataspaces of connected processes form a
    transiently shared dataspace to allow exchange of tuples.

    To control how tuples are distributed, dataspaces can carry
    out "reactions". A reaction specifies an action to be executed
    when a tuple matching a given template is found in the
    local dataspace.
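
    A reaction can be pictured as a (template, action) pair attached to a
    local dataspace. The sketch below is not the Lime API: the class
    LocalDataspace, its methods, and the use of None as a wildcard are
    assumptions made for illustration, and transient sharing between
    connected processes is left out.

```python
def matches(template, tup) -> bool:
    """None in the template acts as a wildcard for that field."""
    return len(template) == len(tup) and all(
        t is None or t == v for t, v in zip(template, tup)
    )


class LocalDataspace:
    """Toy local dataspace that fires reactions on matching tuples."""

    def __init__(self):
        self.tuples = []
        self._reactions = []  # list of (template, action) pairs

    def add_reaction(self, template, action):
        self._reactions.append((template, action))

    def out(self, tup):
        """Store a tuple locally, then run every reaction whose template matches."""
        self.tuples.append(tup)
        for template, action in self._reactions:
            if matches(template, tup):
                action(tup)


space = LocalDataspace()
forwarded = []
space.add_reaction(("alert", None), forwarded.append)
space.out(("alert", "low battery"))
space.out(("status", "ok"))
```

    Here the action simply collects matching tuples; in a mobile setting
    it could instead move the tuple to another process's dataspace.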


  18. Lime [continued]


  19. Processes

    Nothing special: we just need efficient
    mechanisms for searching a large
    collection of data.

    The main problem is devising schemes that work
    well in distributed environments.


  20. Communication

    In Java-based systems, remote method invocation is used
    for communication.

    In wide-area networks the system should be
    self-organizing or use content-based routing to
    ensure that a message reaches only its
    intended subscribers.


  21. Content-Based Routing

    Routers can take routing decisions based on the
    message content, cutting off routes that do not lead to
    receivers of the message.

    Clients can tell the servers which messages they are
    interested in so that the servers notify them when they
    receive a relevant message. This is done in 2 layers,
    where layer 1 consists of a shared broadcast tree
    connecting the servers using routers.

    In simple subject-based publish/subscribe using a
    unique (non-compound) keyword, we can send each
    published message to every server, as in TIB/Rendezvous.


  22. Content-Based Routing [continued]
    Or let every server broadcast its subscriptions to
    all other servers to be able to compile a list of
    (subject, destination) pairs.
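
    Compiling the list of (subject, destination) pairs from the broadcast
    subscriptions might look like this. The function names and the shape
    of the announcements, a list of (server, subjects) pairs, are
    assumptions of the sketch.

```python
from collections import defaultdict


def build_routing_table(announcements):
    """Turn (server, subjects) announcements into subject -> set of servers."""
    table = defaultdict(set)
    for server, subjects in announcements:
        for subject in subjects:
            table[subject].add(server)
    return table


def destinations(table, subject):
    """Servers that must receive a message published on the given subject."""
    return sorted(table.get(subject, ()))


table = build_routing_table([
    ("server-1", ["stocks", "weather"]),
    ("server-2", ["weather"]),
])
```

    A message on a subject nobody subscribed to has no destinations and
    does not need to be forwarded at all.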


  23. Content-Based Routing [continued]

    Each server broadcasts its subscription across
    the network so that routers can compose routing
    filters.

    When a node leaves the system, it should
    cancel its subscriptions and essentially broadcast
    this information to all routers.

    Comparison of subscriptions and data items to
    be routed can be computationally expensive.
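
    A routing filter can be thought of as a set of predicates per outgoing
    link: a data item is forwarded on a link only if some subscription
    behind that link matches it. The class Router, its methods, and the
    link names below are hypothetical and only illustrate the idea.

```python
class Router:
    """Toy content-based router with one list of filters per outgoing link."""

    def __init__(self):
        self._filters = {}  # link name -> list of predicates over data items

    def add_subscription(self, link, predicate):
        self._filters.setdefault(link, []).append(predicate)

    def cancel_link(self, link):
        """Drop all filters for a link, e.g. when the node behind it leaves."""
        self._filters.pop(link, None)

    def route(self, item):
        """Return the links on which the item has to be forwarded."""
        return sorted(
            link
            for link, predicates in self._filters.items()
            if any(predicate(item) for predicate in predicates)
        )


router = Router()
router.add_subscription("east", lambda item: item.get("price", 0) > 100)
router.add_subscription("west", lambda item: item.get("symbol") == "ACME")
before = router.route({"symbol": "ACME", "price": 150})
router.cancel_link("east")
after = router.route({"symbol": "ACME", "price": 150})
```

    Every predicate is evaluated for every item here, which hints at why
    the comparison can become computationally expensive.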


  24. Supporting Composite Subscriptions

    With more sophisticated subscription expressions,
    the simple content-based routing just described is
    no longer enough and another approach is needed.

    Express compositions of subscriptions in which
    a process specifies in a single subscription that it
    is interested in very different types of data items.

    Design routers analogous to rule databases
    where subscriptions are transformed into rules
    stating the conditions under which published data
    should be forwarded.
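
    One way to picture subscriptions as rules is to build them from small
    composable predicates. The helper names below (attribute_equals,
    attribute_in_range, all_of, any_of) are hypothetical, and the sketch
    covers only how a single composite rule is evaluated, not how rules
    are distributed over routers.

```python
def attribute_equals(name, value):
    """Rule: the item has the given attribute with exactly this value."""
    return lambda item: item.get(name) == value


def attribute_in_range(name, low, high):
    """Rule: the item's attribute lies in the closed range [low, high]."""
    return lambda item: name in item and low <= item[name] <= high


def all_of(*rules):
    return lambda item: all(rule(item) for rule in rules)


def any_of(*rules):
    return lambda item: any(rule(item) for rule in rules)


# One subscription covering two very different types of data items.
composite = any_of(
    all_of(attribute_equals("type", "stock"), attribute_in_range("price", 10, 20)),
    attribute_equals("type", "weather"),
)
```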


  25. Thank you
