Efficiency and effectiveness of query processing in cluster-based retrieval

Fazli Can; Ismail Sengoer Altingoevde; Engin Demir


Abstract

Our research shows that for large databases, without considerable additional storage overhead, cluster-based retrieval (CBR) can compete with the time efficiency and effectiveness of the inverted index-based full search (FS). The proposed CBR method employs a storage structure that blends the cluster membership information into the inverted file posting lists. This approach significantly reduces the cost of similarity calculations for document ranking during query processing and improves efficiency. For example, in terms of in-memory computations, our new approach can reduce query processing time to 39% of FS. The experiments confirm that the approach is scalable and that system performance improves with increasing database size. In the experiments, we use the cover coefficient-based clustering methodology (C³M) and the Financial Times database of TREC, which contains 210 158 documents of size 564 MB defined by 229 748 terms, with a total of 29 545 234 inverted index elements. This study provides CBR efficiency and effectiveness experiments using the largest corpus in an environment that employs no user interaction or user behavior assumption for clustering.
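The abstract does not spell out the storage layout, so the following is only a minimal sketch of the general idea of folding cluster membership into posting lists, not the authors' actual structure. It assumes each term's postings are grouped by cluster ID and uses raw term-frequency sums as a stand-in for the similarity measure. The names `build_index`, `rank_clusters`, and `cbr_search`, the toy documents, and the scoring scheme are all hypothetical.

```python
from collections import defaultdict

def build_index(docs, cluster_of):
    """Build {term: {cluster_id: [(doc_id, tf), ...]}}.

    Assumption: grouping postings by cluster is one possible way to
    blend membership information into the inverted file.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id in sorted(docs):
        counts = defaultdict(int)
        for term in docs[doc_id]:
            counts[term] += 1
        for term, tf in counts.items():
            index[term][cluster_of[doc_id]].append((doc_id, tf))
    return index

def rank_clusters(index, query, top_clusters):
    """Score each cluster by summed term frequency (a stand-in measure)."""
    scores = defaultdict(int)
    for term in query:
        for cluster_id, postings in index.get(term, {}).items():
            scores[cluster_id] += sum(tf for _, tf in postings)
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_clusters]

def cbr_search(index, query, top_clusters, top_docs):
    """Rank only documents that belong to the best-matching clusters."""
    selected = rank_clusters(index, query, top_clusters)
    scores = defaultdict(int)
    for term in query:
        by_cluster = index.get(term, {})
        for cluster_id in selected:
            # Postings of clusters that were not selected are never touched.
            for doc_id, tf in by_cluster.get(cluster_id, []):
                scores[doc_id] += tf
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:top_docs]

# Toy data, purely illustrative.
docs = {
    1: ["cluster", "retrieval", "query"],
    2: ["cluster", "index"],
    3: ["stock", "market"],
    4: ["market", "query", "query"],
}
cluster_of = {1: 0, 2: 0, 3: 1, 4: 1}
index = build_index(docs, cluster_of)
print(cbr_search(index, ["cluster", "query"], top_clusters=1, top_docs=10))
```

In this sketch the saving comes from the inner loop: once the best clusters are chosen, similarity is accumulated only over the posting segments of those clusters, whereas a full search would walk every posting of every query term.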


Bibliographic details

Source
Information Systems, 2004, issue 8, pp. 697-717 (21 pages)
Authors
Fazli Can; Ismail Sengoer Altingoevde; Engin Demir
Affiliation

Computer Science and Systems Analysis Department, Miami University, Oxford, OH 45056, USA

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language of text: English
Chinese Library Classification: computing technology, computer technology
Keywords
clustering; cluster-based retrieval; information retrieval; performance; query processing

Date added to database: 2022-08-18 02:48:11

Similar literature

1. Cluster-based query expansion using external collections in medical information retrieval [J]. Journal of biomedical informatics, 2015.
2. An empirical study of query expansion and cluster-based retrieval in language modeling approach [J]. Na SH, Kang IS, Roh JE. Information Processing & Management, 2007, issue 2.
3. Relative Query Specification and Their Query Processing Methods in Information Retrieval [J]. Shinsuke NAKAJIMA, Katsumi TANAKA. 電子情報通信学会技術研究報告. デ-タ工学. Data Engineering, 2003, issue 191.
4. Improving Retrieval Effectiveness for Temporal-Constrained Top-K Query Processing [C]. Hao Wu, Kuang Lu, Xiaoming Li. Asia information retrieval societies conference, 2017.
5. Cluster-based Query Expansion Using Language Modeling for Biomedical Literature Retrieval [D]. Xu, Xuheng. 2011.
6. Research and applications: Improving image retrieval effectiveness via query expansion using MeSH hierarchical structure [O]. Mariano Crespo Azcárate, Jacinto Mata Vázquez, Manuel Maña López. 2013.
7. Efficiency and effectiveness of query processing in cluster-based retrieval [O]. Fazli Can, İsmail Sengör Altingövde. 2015.
