transparent distribution of components, emphasis lies on the coordination of activities between those components.
• Clean separation between computation and coordination.
• The coordination part of a distributed system handles the communication and cooperation between processes.
• The focus is on how coordination between the processes takes place.
Direct coordination
  - Processes know each other's addresses and/or names
  - Processes know the form of the messages
  - Processes must be up at the same time
• Referentially coupled, temporally decoupled: mailbox coordination
  - Like persistent message-oriented communication
  - Processes know the form of the messages in advance
  - Processes do not have to be up at the same time
Meeting-oriented coordination
  - Often implemented by events
  - Publish/subscribe with no persistence in the dataspace
  - Processes meet at a certain time to coordinate
• Referentially decoupled, temporally decoupled: generative communication
  - Independent processes make use of a shared persistent dataspace of tuples
  - No need to agree on the structure of tuples in advance
  - Tuple tags distinguish between tuples that carry different kinds of information
  - Publish/subscribe with persistence in the dataspace
attributes
• A subscription is passed to the middleware with a description of the data items that the subscriber is interested in.
• A subscription is described as (attribute, value) pairs, possibly with a range for each attribute and SQL-like predicates.
• When matching data items are found, the middleware either sends a notification to the subscribers so that they can read the items, or forwards the data items directly (no storage is needed if they are forwarded directly).
• Publishing processes publish events, which are data items with one attribute.
• Matching data items against subscriptions should be efficient and scalable.
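The matching of published data items against (attribute, value) and range subscriptions can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the names `Subscription`, `matches`, and `notify_subscribers` are hypothetical, ranges are taken to be closed intervals, and the linear scan over all subscriptions is chosen for clarity, not for the efficiency or scalability a real middleware would need.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """Hypothetical subscription: exact values and closed ranges per attribute."""
    subscriber: str
    equals: dict = field(default_factory=dict)   # attribute -> required value
    ranges: dict = field(default_factory=dict)   # attribute -> (low, high), inclusive

    def matches(self, item: dict) -> bool:
        # Every constrained attribute must be present and satisfy its predicate.
        for attr, value in self.equals.items():
            if attr not in item or item[attr] != value:
                return False
        for attr, (low, high) in self.ranges.items():
            if attr not in item or not (low <= item[attr] <= high):
                return False
        return True


def notify_subscribers(subscriptions, item):
    """Return the subscribers whose subscription matches the published item."""
    return [s.subscriber for s in subscriptions if s.matches(item)]


subs = [
    Subscription("alice", equals={"symbol": "ACME"}, ranges={"price": (10, 20)}),
    Subscription("bob", ranges={"price": (0, 5)}),
]
matched = notify_subscribers(subs, {"symbol": "ACME", "price": 12})
```

Here only the first subscription matches the published item, because the second one constrains `price` to a range that excludes 12.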
a typed template tuple for matching.
• A field in the template tuple either contains a reference to an actual object or contains the value NULL.
• Two fields match if they hold a copy of the same reference or if the template tuple field is NULL.
• A tuple instance matches a template tuple if their corresponding fields match pairwise.
• Read and Take (which removes the tuple after reading) block the caller until a matching tuple is available.
• Some implementations return immediately if there is no matching tuple, or allow a timeout to be set.
• Centralized implementations make complex matching rules easier to support and can also be used for synchronization.
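The template matching and blocking Read/Take behavior described above can be approximated in a few lines. The sketch below is not the JavaSpaces API: `TupleSpace` and its methods are hypothetical names, Python's `None` stands in for NULL, and value equality stands in for "a copy of the same reference".

```python
import threading


class TupleSpace:
    """Hypothetical in-memory tuple space; None plays the role of NULL."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(template, candidate):
        # Same type and length, and each template field is None or equal.
        return (type(template) is type(candidate)
                and len(template) == len(candidate)
                and all(t is None or t == c for t, c in zip(template, candidate)))

    def _find(self, template):
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def write(self, tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()  # wake up blocked readers and takers

    def read(self, template, timeout=None):
        """Block until a match exists (or the timeout expires); keep the tuple."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout)
            return self._find(template) if found else None

    def take(self, template, timeout=None):
        """Like read, but remove the matching tuple from the space."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout)
            if not found:
                return None
            tup = self._find(template)
            self._tuples.remove(tup)
            return tup
```

Passing `timeout=0` gives the "return immediately" variant, while `timeout=None` blocks until another thread writes a matching tuple.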
published tuples to subscribers using multicasting
• A data item is a message tagged with a compound keyword describing its content (its subject).
• Uses broadcast, or unicast if the subscribers' addresses are known.
• Each host has a rendezvous daemon, which takes care that messages are sent and delivered according to their subject.
• The daemon keeps a table of (process, subject) entries; whenever a message on subject S arrives, the daemon checks its table for local subscribers and forwards the message to each one.
• Can allow complex matching of published data items against subscriptions.
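The daemon's table lookup can be illustrated with a small sketch. It is not the TIB/Rendezvous API: `RendezvousDaemon`, `subscribe`, and `on_message` are hypothetical names, subjects are matched by exact string equality only, and "forwarding" is modeled as appending to an in-memory list.

```python
from collections import defaultdict


class RendezvousDaemon:
    """Hypothetical per-host daemon holding (process, subject) entries."""

    def __init__(self):
        self._table = defaultdict(set)       # subject -> local process names
        self.delivered = defaultdict(list)   # process name -> received messages

    def subscribe(self, process, subject):
        self._table[subject].add(process)

    def on_message(self, subject, payload):
        """Forward an incoming message to every local subscriber of its subject."""
        receivers = sorted(self._table.get(subject, ()))
        for process in receivers:
            self.delivered[process].append((subject, payload))
        return receivers


daemon = RendezvousDaemon()
daemon.subscribe("p1", "news.comp.os")
daemon.subscribe("p2", "news.comp.os")
daemon.subscribe("p3", "news.sport")
receivers = daemon.on_message("news.comp.os", "hello")
```

A message on a subject with no local subscribers is simply dropped by this daemon, which is why it returns an empty list in that case.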
data items may be necessary.
• Keywords or (attribute, value) pairs are hashed to unique identifiers for published data, which can be implemented efficiently in a DHT-based system.
• For more advanced matching rules, gossip-based publish/subscribe systems can be used.
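As a rough illustration of hashing a keyword to an identifier and locating the responsible node, the sketch below uses SHA-1 and a Chord-style successor rule. Both choices are assumptions made for this example (the notes only say "DHT-based"), as are the names `key_for`, `responsible_node`, and the 16-bit identifier space.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # small identifier space, chosen only for illustration


def key_for(keyword: str) -> int:
    """Hash a keyword (or a serialized attribute/value pair) to an identifier."""
    digest = hashlib.sha1(keyword.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


def responsible_node(key: int, node_ids: list) -> int:
    """Chord-style successor: the first node id >= key, wrapping around."""
    ids = sorted(node_ids)
    index = bisect_left(ids, key)
    return ids[index % len(ids)]


nodes = [100, 20000, 45000]
rendezvous = responsible_node(key_for("news.sport"), nodes)
```

Publishers and subscribers that hash the same keyword arrive at the same node, which can then act as the meeting point for that keyword.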
of (attribute, value/range) pairs
• As in CAN (Content Addressable Network), derive a floating-point value from each attribute and organize subscriber nodes into a two-dimensional array of floats to form groups.
• Cluster nodes into M different groups, such that nodes i and j belong to the same group if and only if their subscriptions Si and Sj intersect.
• Each node maintains a list of references to neighbors (a partial view) in order to discover intersecting subscriptions.
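The test for whether two subscriptions intersect can be sketched as below, assuming each subscription is a mapping from attribute names to closed float ranges and that an attribute missing from a subscription is unconstrained. The function names and the data layout are hypothetical; how the partial view itself is maintained (for example by gossiping) is not shown.

```python
def intersects(sub_a: dict, sub_b: dict) -> bool:
    """Range subscriptions intersect if they overlap on every shared attribute."""
    for attr in sub_a.keys() & sub_b.keys():
        low_a, high_a = sub_a[attr]
        low_b, high_b = sub_b[attr]
        if low_a > high_b or low_b > high_a:
            return False  # disjoint on this attribute
    return True


def neighbors_with_overlap(own: dict, partial_view: dict) -> list:
    """Pick the nodes in the partial view whose subscription intersects ours."""
    return sorted(name for name, sub in partial_view.items()
                  if intersects(own, sub))


own = {"x": (0.0, 0.5), "y": (0.2, 0.4)}
view = {
    "n1": {"x": (0.4, 0.9), "y": (0.3, 0.8)},
    "n2": {"x": (0.6, 0.9), "y": (0.0, 1.0)},
    "n3": {"y": (0.4, 0.6)},
}
group_candidates = neighbors_with_overlap(own, view)
```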
is handled by a separate process Pi, which in turn partitions the range of its attribute across multiple processes.
• When a data item d is published, it is forwarded to each Pi, where it is stored at the process responsible for d's value of a.
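A minimal sketch of this per-attribute partitioning follows. It assumes numeric attribute values and fixed range boundaries; `AttributeHandler` (standing in for a process Pi), `publish`, and the in-memory `stores` lists (standing in for the processes behind Pi) are hypothetical names.

```python
from bisect import bisect_right


class AttributeHandler:
    """Hypothetical process Pi: splits one attribute's range over several stores."""

    def __init__(self, attribute, boundaries):
        # boundaries [10, 20] define three partitions: < 10, [10, 20), >= 20
        self.attribute = attribute
        self.boundaries = sorted(boundaries)
        self.stores = [[] for _ in range(len(self.boundaries) + 1)]

    def store(self, item):
        """Store the item at the partition responsible for its attribute value."""
        index = bisect_right(self.boundaries, item[self.attribute])
        self.stores[index].append(item)
        return index


def publish(item, handlers):
    """Forward a published item to every handler whose attribute it carries."""
    return {h.attribute: h.store(item) for h in handlers if h.attribute in item}


handlers = [AttributeHandler("price", [10, 20]), AttributeHandler("volume", [1000])]
placement = publish({"price": 15, "volume": 500}, handlers)
```

The returned mapping shows, for each attribute, the index of the partition that ended up storing the item.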
received a message is problematic.
• Two solutions are suggested:
  1- The receiving mobile process saves older messages to make sure it does not accept duplicates.
  2- Routers keep track of the mobile peers and know which messages each of them has received (harder to implement).
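The first solution, duplicate suppression at the receiver, might look like the sketch below. It assumes that every message carries a unique identifier and that remembering identifiers is enough (instead of keeping the full older messages); `MobileReceiver` and `receive` are hypothetical names.

```python
class MobileReceiver:
    """Hypothetical mobile process that remembers ids of messages it has seen."""

    def __init__(self):
        self._seen = set()
        self.inbox = []

    def receive(self, message_id, payload):
        """Accept a message once; return False if it is a duplicate."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        self.inbox.append(payload)
        return True


receiver = MobileReceiver()
first = receiver.receive("m1", "price update")
again = receiver.receive("m1", "price update")   # redelivered after reconnecting
```

In practice the set of remembered identifiers would have to be bounded, for example by discarding entries older than some window, which this sketch leaves out.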
dataspace
• When processes are connected, their dataspaces become shared.
• Formally, the processes should be members of the same group and use the same group communication protocol.
• The local dataspaces of connected processes form a transiently shared dataspace that allows them to exchange tuples.
• To control how tuples are distributed, dataspaces can carry out "reactions". A reaction specifies an action to be executed when a tuple matching a given template is found in the local dataspace.
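The idea of a reaction can be sketched as a (template, action) pair that a local dataspace checks whenever a tuple is present. The code below is an illustration only: `LocalDataspace`, `add_reaction`, and `move_to` are hypothetical names, `None` acts as a wildcard field, and connectivity between processes is not modeled.

```python
class LocalDataspace:
    """Hypothetical local dataspace with reactions; None is a wildcard field."""

    def __init__(self):
        self.tuples = []
        self._reactions = []  # list of (template, action) pairs

    @staticmethod
    def _matches(template, tup):
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def _fire(self, template, action, tup):
        # Run the action only if the tuple is still present and matches.
        if tup in self.tuples and self._matches(template, tup):
            action(self, tup)

    def add_reaction(self, template, action):
        self._reactions.append((template, action))
        for tup in list(self.tuples):          # tuples that are already here
            self._fire(template, action, tup)

    def write(self, tup):
        self.tuples.append(tup)
        for template, action in list(self._reactions):
            self._fire(template, action, tup)


def move_to(target):
    """Build an action that moves a matching tuple to another dataspace."""
    def action(space, tup):
        space.tuples.remove(tup)
        target.write(tup)
    return action


a, b = LocalDataspace(), LocalDataspace()
a.write(("job", 1))
a.add_reaction(("job", None), move_to(b))
a.write(("job", 2))
a.write(("log", "started"))
```

After these calls, both `("job", ...)` tuples have been moved to `b`, while the `("log", ...)` tuple stays in `a` because no reaction matches it.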
communication.
• In wide-area networks, the system should use self-organization or content-based routing to ensure that a message reaches only its intended subscribers.
the message content so that it cuts off routes that do not lead to receivers of this message.
• Clients can tell the servers which messages they are interested in, so that the servers notify them when they receive a relevant message. This is done in two layers, where layer 1 consists of a shared broadcast tree connecting the servers using routers.
• In simple subject-based publish/subscribe, each data item is tagged with a unique (non-compound) keyword.
• In that case, each published message can be sent to every server, as in TIB/Rendezvous.
the network so that routers can compose routing filters.
• When a node leaves the system, it should cancel its subscriptions and essentially broadcast this information to all routers.
• Comparison of subscriptions and data items to be routed can be computationally expensive.
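A routing filter can be pictured as the set of subscriptions reachable through each outgoing link; a router forwards a data item on a link only if at least one of those subscriptions matches. The sketch below assumes subscriptions are plain predicate functions; `Router`, its methods, and the link names are hypothetical.

```python
class Router:
    """Hypothetical content-based router: one filter (set of predicates) per link."""

    def __init__(self):
        self._filters = {}  # link name -> {subscription id: predicate}

    def subscribe(self, link, sub_id, predicate):
        self._filters.setdefault(link, {})[sub_id] = predicate

    def unsubscribe(self, link, sub_id):
        """Cancel a subscription, e.g. when the subscribing node leaves."""
        subs = self._filters.get(link, {})
        subs.pop(sub_id, None)
        if not subs:
            self._filters.pop(link, None)

    def route(self, item):
        """Return the links on which the item should be forwarded."""
        return sorted(link for link, subs in self._filters.items()
                      if any(pred(item) for pred in subs.values()))


router = Router()
router.subscribe("east", "s1", lambda m: m.get("topic") == "weather")
router.subscribe("west", "s2", lambda m: m.get("price", 0) > 100)
links = router.route({"topic": "stocks", "price": 150})
```

Every call to `route` evaluates predicates against the item, which hints at why comparing subscriptions with data items can become expensive as the number of subscriptions grows.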
of subscriptions, then the simple content-based routing described so far is no longer sufficient and another approach is needed.
• Express compositions of subscriptions, in which a process specifies in a single subscription that it is interested in very different types of data items.
• Design routers analogous to rule databases, where subscriptions are transformed into rules stating the conditions under which published data should be forwarded.