The wide popularity of free-and-easy keyword-based searches over the World Wide Web has fueled the demand for incorporating keyword-based search over structured databases. However, most current research focuses on keyword-based searching over a single structured data source. With the growing interest in distributed databases and service-oriented architecture over the Internet, it is important to extend such a capability over multiple structured data sources. One of the most important problems in enabling such a query facility is selecting the data sources most relevant to the keyword query. Traditional database summary techniques developed in the IR literature for selecting unstructured data sources are inadequate for our problem, as they do not capture the structure of the data sources. In this paper, we study the database selection problem for relational data sources, and propose a method that effectively summarizes the relationships between keywords in a relational database based on its structure. We develop effective ranking methods based on the keyword relationship summaries in order to select the most useful databases for a given keyword query. We have implemented our system on PlanetLab. In that environment, we use extensive experiments with real datasets to demonstrate the effectiveness of our proposed summarization method.