Large-Scale Content-Based Publish-Subscribe Systems

Mühl, Gero (2002)
Large-Scale Content-Based Publish-Subscribe Systems.
Technische Universität Darmstadt
Dissertation, first publication

Abstract

Today, the architecture of distributed computer systems is dominated by client/server platforms relying on synchronous request/reply. This architecture is not well suited to implementing information-driven applications such as news delivery, stock quoting, air traffic control, and dissemination of auction bids, because the demands of these applications inherently mismatch the characteristics of those platforms. In contrast, publish/subscribe directly reflects the intrinsic behavior of information-driven applications because communication is indirect and initiated by the producers of information: producers publish notifications, and these are delivered to subscribed consumers with the help of a notification service that decouples producers from consumers. Therefore, publish/subscribe should be the first choice for implementing such applications. The expressiveness of the notification selection mechanism, which consumers use to describe the notifications they are interested in, is crucial for the flexibility of a notification service. Content-based notification selection is the most expressive because it allows filter predicates to be evaluated over the whole content of a notification. This advantage in expressiveness over channel- or subject-based selection results in increased flexibility, which facilitates extensibility and change. On the other hand, scalable implementations of content-based notification services are difficult to realize. Indeed, the expressiveness of notification selection must be chosen carefully in large-scale systems, because expressiveness and scalability are interdependent. Hence, the most fundamental problem in the area of content-based publish/subscribe systems is probably the scalable routing of notifications from their producers to their respective consumers. Unfortunately, existing content-based notification services are not mature enough to be used in large-scale, widely distributed environments.
Most existing notification services are either centralized, use flooding, or use simple routing algorithms that assume each event broker has global knowledge of all active subscriptions. All of these approaches exhibit severe scalability problems in large-scale systems. In contrast, this thesis concentrates on mechanisms that improve the scalability of content-based routing and presents more advanced routing algorithms that do not rely on global knowledge. The algorithms presented here exploit similarities between subscriptions by using identity and covering tests and by merging filters. While identity-based routing is a simplified version of covering-based routing, merging-based routing is more advanced because it exploits the concept of filter merging. Furthermore, the idea of imperfect routing algorithms is introduced. The thesis consists of a theoretical and a practical part. The theoretical part presents a formal specification of publish/subscribe systems, a routing framework, and a set of routing algorithms, and discusses how the routing optimizations can be broken down to the actual data/filter model. The practical part presents the implementation of the Rebeca notification service, which supports advertisements and all the routing algorithms mentioned above. A detailed practical evaluation of the implemented algorithms, based on the prototype, is also presented.
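
The identity tests, covering tests, and filter merging named in the abstract can be illustrated with a small sketch. It is not taken from the thesis or from Rebeca: the filter model (a conjunction of closed numeric intervals per attribute) and the function names `matches`, `identical`, `covers`, and `merge` are assumptions chosen only to make the three ideas concrete.

```python
from typing import Dict, Optional, Tuple

# Assumed toy model (not the thesis's data/filter model):
# a filter is a conjunction of closed, non-empty intervals, one per attribute.
Interval = Tuple[float, float]
Filter = Dict[str, Interval]
Notification = Dict[str, float]


def matches(f: Filter, n: Notification) -> bool:
    """A notification matches if every constrained attribute is present and in range."""
    return all(a in n and lo <= n[a] <= hi for a, (lo, hi) in f.items())


def identical(f1: Filter, f2: Filter) -> bool:
    """Identity test: both filters impose exactly the same constraints."""
    return f1 == f2


def covers(f1: Filter, f2: Filter) -> bool:
    """Covering test: every notification matching f2 also matches f1."""
    return all(
        a in f2 and lo <= f2[a][0] and f2[a][1] <= hi
        for a, (lo, hi) in f1.items()
    )


def merge(f1: Filter, f2: Filter) -> Optional[Filter]:
    """Return one filter matching exactly the union of f1 and f2, or None.

    In this toy model an exact merge exists when both filters constrain the
    same attributes, differ in at most one of them, and the two intervals on
    that attribute overlap or touch.
    """
    if f1.keys() != f2.keys():
        return None
    differing = [a for a in f1 if f1[a] != f2[a]]
    if len(differing) > 1:
        return None
    merged = dict(f1)
    if differing:
        a = differing[0]
        (lo1, hi1), (lo2, hi2) = f1[a], f2[a]
        if max(lo1, lo2) > min(hi1, hi2):  # disjoint intervals: union is not an interval
            return None
        merged[a] = (min(lo1, lo2), max(hi1, hi2))
    return merged


wide = {"price": (0.0, 100.0)}
narrow = {"price": (10.0, 20.0), "volume": (0.0, 5.0)}
print(covers(wide, narrow))   # True
print(covers(narrow, wide))   # False
print(merge({"price": (0.0, 50.0)}, {"price": (40.0, 100.0)}))  # {'price': (0.0, 100.0)}
```

Under these assumptions, a broker that has already forwarded `wide` to a neighbor would not need to forward `narrow` as well, since every notification `narrow` selects is already selected by `wide`; merging goes one step further by replacing several forwarded filters with a single one.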

Item type: Dissertation
Published: 2002
Author(s): Mühl, Gero
Type of entry: First publication
Title: Large-Scale Content-Based Publish-Subscribe Systems
Language: English
Referees: Bacon, Ph.D. Jean
Advisor: Buchmann, Prof. Ph.D Alejandro P.
Publication date: 22 November 2002
Place: Darmstadt
Publisher: Technische Universität
Date of oral examination: 30 September 2002
URL / URN: urn:nbn:de:tuda-tuprints-2746
Dewey Decimal Classification (DDC): 000 Generalities, computer science, information science > 004 Computer science
Department(s)/field(s): 20 Department of Computer Science
Date deposited: 17 Oct 2008 09:21
Last modified: 26 Aug 2018 21:24