Duke University Libraries
View Item 
  •   DukeSpace
  • Theses and Dissertations
  • Duke Dissertations
  • View Item
  •   DukeSpace
  • Theses and Dissertations
  • Duke Dissertations
  • View Item
    • Login
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Unifying Databases and Internet-Scale Publish/Subscribe

    Thumbnail
    View / Download
    3.1 Mb
    Date
    2008-08-01
    Author
    Chandramouli, Badrish
    Advisor
    Yang, Jun
    Repository Usage Stats
    420
    views
    792
    downloads
    Abstract

    With the advent of Web 2.0 and the Digital Age, we are witnessing an unprecedented increase in the amount of information collected, and in the number of users interested in different types of information. This growth means that traditional techniques, where users poll data sources for information of interest, are no longer sufficient. Polling too frequently does not scale, while polling less often may result in users missing important updates. The alternative push technology has long been the goal of publish/subscribe systems, which proactively push updates (events) to users with matching interests (expressed as subscriptions). The push model is better suited for ensuring scalability and timely delivery of updates, important in many application domains: personal (e.g., RSS feeds, online auctions), financial (e.g., portfolio monitoring), security (e.g., reporting network anomalies), etc.

    Early publish/subscribe systems were based on predefined subjects (channels), and were too coarse-grained to meet the specific interests of different subscribers. The second generation of content-based publish/subscribe systems offer greater flexibility by supporting subscriptions defined as predicates over message contents. However, subscriptions are still stateless filters over individual messages, so they cannot express queries across different messages or over the event history. The few systems that support more powerful database-style subscriptions do not address the problem of efficiently delivering updates to a large number of subscribers over a wide-area network. Thus, there is a need to develop next-generation publish/subscribe systems that unify the support for richer database-style subscription queries and flexible wide-area notification. This support needs to be complemented with robust processing and dissemination techniques that scale to high event rates and large databases, as well as to a large number of subscribers over the Internet.

    The main contribution of our work is a collection of techniques to support efficient and scalable event processing and notification dissemination for an Internet-scale publish/subscribe system with a rich subscription model. We investigate the interface between event processing by a database server and notification delivery by a dissemination network. Previous research in publish/subscribe has largely been compartmentalized; database-centric and network-centric approaches each have their own limitations, and simply putting them together does not lead to an efficient solution. A closer examination of database/network interfaces yields a spectrum of new and interesting possibilities. In particular, we propose message and subscription reformulation as general techniques to support stateful subscriptions over existing content-driven networks, by converting them into equivalent but stateless forms. We show how reformulation can successfully be applied to various stateful subscriptions including range-aggregation, select-joins, and subscriptions with value-based notification conditions. These techniques often provide orders-of-magnitude improvement over simpler techniques adopted by current systems, and are shown to scale to millions of subscriptions. Further, the use of a standard off-the-shelf content-driven dissemination interface allows these techniques to be easily deployed, managed, and maintained in a large-scale system.

    Based on our findings, we have built a high-performance publish/subscribe system named ProSem (to signify the inseparability of database processing and network dissemination). ProSem uses our novel techniques for group-processing many types of complex and expressive subscriptions, with a per-event optimization framework that chooses the best processing and dissemination strategy at runtime based on online statistics and system objectives.

    Type
    Dissertation
    Department
    Computer Science
    Subject
    Computer Science
    database
    publish
    subscribe
    network
    processing
    dissemination
    Permalink
    http://hdl.handle.net/10161/698
    Citation
    Chandramouli, Badrish (2008). Unifying Databases and Internet-Scale Publish/Subscribe. Dissertation, Duke University. Retrieved from http://hdl.handle.net/10161/698.
    Collections
    • Duke Dissertations
    More Info
    Show full item record
    Creative Commons License
    This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.

    Rights for Collection: Duke Dissertations

     

     

    Browse

    All of DukeSpaceCommunities & CollectionsAuthorsTitlesTypesBy Issue DateDepartmentsAffiliations of Duke Author(s)SubjectsBy Submit DateThis CollectionAuthorsTitlesTypesBy Issue DateDepartmentsAffiliations of Duke Author(s)SubjectsBy Submit Date

    My Account

    LoginRegister

    Statistics

    View Usage Statistics