Top-K data source selection for keyword queries over multiple XML data sources

Khanh Nguyen; Jinli Cao


Abstract

With the proliferation of XML data, searching XML data using keyword queries has attracted much attention. However, most current approaches focus on keyword-based searches over a single XML document. Searching a system that integrates hundreds or even thousands of data sources by sequentially querying every single source is extremely costly, and thus may be impractical. In this article, we propose a novel approach for selecting the top-K data sources according to their relevance to a given query, to avoid the high cost of searching numerous, potentially irrelevant data sources. Our approach summarizes the data sources as succinct synopses for the rapid filtering of non-promising sources. We maintain both structural and value distribution information for each data source, and propose a novel ranking function to effectively measure the relevance of a data source to the given query. We conducted experiments with real datasets, and the results show that our approach achieves high performance on all evaluation metrics (recall, precision and Spearman's rank correlation coefficient) under different experimental parameters.
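
As a rough illustration of the evaluation setting the abstract describes, the sketch below picks the top-K sources from a set of relevance scores and then computes precision, recall and Spearman's rank correlation against a reference ranking. It is not the authors' ranking function or synopsis structure: the function names, the source ids and the score values are all hypothetical, and the Spearman formula used is the standard one for rankings without ties.

```python
import heapq


def top_k_sources(scores, k):
    """Return the k source ids with the highest score (ties broken by id).

    `scores` is a hypothetical mapping from source id to an estimated
    relevance score; how such scores are produced is outside this sketch.
    """
    best = heapq.nsmallest(k, scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [source for source, _ in best]


def precision_recall(selected, relevant):
    """Precision and recall of a selected set against a reference set."""
    hits = len(set(selected) & set(relevant))
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings of the same items, assuming no ties."""
    n = len(rank_a)
    if n < 2 or set(rank_a) != set(rank_b):
        raise ValueError("rankings must cover the same items (at least two)")
    pos_b = {item: i for i, item in enumerate(rank_b)}
    d2 = sum((i - pos_b[item]) ** 2 for i, item in enumerate(rank_a))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Made-up scores for four sources and a made-up reference ranking.
estimated = {"s1": 0.9, "s2": 0.1, "s3": 0.5, "s4": 0.7}
reference_ranking = ["s1", "s3", "s4", "s2"]

selected = top_k_sources(estimated, k=2)
precision, recall = precision_recall(selected, reference_ranking[:2])
rho = spearman_rho(top_k_sources(estimated, k=4), reference_ranking)
```

Using a heap-based selection avoids sorting every source when K is much smaller than the number of sources, which matches the setting of hundreds or thousands of sources mentioned in the abstract.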


Bibliographic details

Source
Journal of Information Science | 2012, Issue 2 | pp. 156-175 | 20 pages
Authors
Khanh Nguyen; Jinli Cao
Affiliations

La Trobe University, Australia;

Department of Computer Science and Computer Engineering, La Trobe University, Kingsbury Dr, Bundoora, Victoria 3086, Australia;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
database selection; keyword search; query estimation; query pattern; XML data; XML query;

Date added to database: 2022-08-17 23:20:51

Similar literature

1. A new Top-k query processing algorithm to guarantee confidentiality of data and user queries on outsourced databases [J]. Hyeong-Jin Kim, Jae-Woo Chang. International journal of systems assurance engineering and management. 2019, Issue 5
2. CLASCN: Candidate Network Selection for Efficient Top-k Keyword Queries over Databases [J]. Jun Zhang, Zhao-Hui Peng, Shan Wang. Journal of Computer Science and Technology (English edition). 2007, Issue 2
3. Privacy-preserving top-k keyword similarity search over outsourced cloud data [J]. Teng Yiping, Cheng Xiang, Su Sen. Communications, China. 2015, Issue 12
4. K-Graphs: Selecting Top-k Data Sources for XML Keyword Queries [C]. Khanh Nguyen, Jinli Cao. International conference on database and expert systems applications; DEXA 2011. 2011
5. Query-based selection and integration of semantic web data sources [D]. Qasem, Abir. 2009
6. Migration from Legacy Data to XML-Experiences with Drug Information Sources at Giessen University [O]. R Schweiger, T Bürkle, AG Tafazzoli. 1999
7. Efficient Top-k Search across Heterogeneous XML Data Sources [O]. 2015
