Fast and effective cluster-based information retrieval using frequent closed itemsets

Djenouri Youcef; Belhadi Asma; Fournier-Viger Philippe; Lin Jerry Chun-Wei


Abstract

Document information retrieval consists of finding the documents in a collection that are most relevant to a user query. Information retrieval techniques are widely used by organizations to facilitate the search for information. However, applying traditional information retrieval techniques to large document collections is time-consuming. Recently, cluster-based information retrieval approaches have been developed. Although these approaches are often much faster than traditional approaches for processing large document collections, the quality of the documents they retrieve is often lower than that of traditional approaches. To address this drawback of cluster-based approaches, and to improve the performance of information retrieval in terms of both runtime and quality of retrieved documents, this paper proposes a new cluster-based information retrieval approach named ICIR (Intelligent Cluster-based Information Retrieval). The proposed approach combines k-means clustering with frequent closed itemset mining to extract clusters of documents and find frequent terms in each cluster. The patterns discovered in each cluster are then used to select the most relevant document clusters to answer each user query. Four alternative heuristics are proposed for selecting the most relevant clusters, and two alternative heuristics for choosing documents within the selected clusters. Combining them yields eight versions of the proposed approach. To validate the proposed approach, extensive experiments have been carried out on well-known document collections. Results show that the designed approach outperforms traditional and cluster-based information retrieval approaches in terms of both execution time and quality of the returned documents. (C) 2018 Elsevier Inc. All rights reserved.
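The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ICIR implementation: the abstract does not specify the four cluster-selection or two document-selection heuristics, so the scoring rules below (support-weighted term overlap for clusters, plain term overlap for documents) are assumptions made purely for illustration. The clusters are hard-coded toy data standing in for the output of k-means, the closed itemsets are mined by brute force, and all function names are hypothetical.

```python
from itertools import combinations


def closed_frequent_itemsets(docs, min_support):
    """Brute-force miner: return {itemset: support} for frequent closed itemsets.

    docs is a list of term sets. An itemset is closed when no proper
    superset occurs in exactly the same number of documents.
    """
    terms = sorted(set().union(*docs))
    support = {}
    for size in range(1, len(terms) + 1):
        for combo in combinations(terms, size):
            items = frozenset(combo)
            count = sum(1 for doc in docs if items <= doc)
            if count >= min_support:
                support[items] = count
    return {
        items: count
        for items, count in support.items()
        if not any(items < other and count == other_count
                   for other, other_count in support.items())
    }


def select_cluster(query, patterns_by_cluster):
    """Assumed heuristic: pick the cluster whose patterns best overlap the query."""
    def score(patterns):
        return sum(len(query & items) * count for items, count in patterns.items())
    return max(patterns_by_cluster, key=lambda name: score(patterns_by_cluster[name]))


def rank_documents(query, docs, top_k=3):
    """Assumed heuristic: rank documents of one cluster by shared query terms."""
    return sorted(docs, key=lambda doc: len(query & doc), reverse=True)[:top_k]


# Toy clusters standing in for the output of k-means (assumption).
clusters = {
    "mining": [
        {"frequent", "itemset", "mining"},
        {"closed", "itemset", "mining"},
        {"itemset", "mining", "pattern"},
    ],
    "retrieval": [
        {"query", "document", "ranking"},
        {"document", "retrieval", "query"},
        {"document", "index", "query"},
    ],
}

# Offline step: mine the patterns of each cluster once.
patterns = {name: closed_frequent_itemsets(docs, min_support=2)
            for name, docs in clusters.items()}

# Online step: route the query to one cluster, then rank only its documents.
query = {"closed", "itemset"}
best = select_cluster(query, patterns)
print(best, rank_documents(query, clusters[best], top_k=1))
```

The point of the design is that pattern mining happens once per cluster, offline, while each query is compared only against the compact pattern sets and then against the documents of the selected cluster rather than the whole collection. The brute-force miner is exponential in the vocabulary size and is suitable only for toy data.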


Bibliographic details

Source
Information Sciences: An International Journal | 2018, issue 2018 | 14 pages
Authors
Djenouri Youcef; Belhadi Asma; Fournier-Viger Philippe; Lin Jerry Chun-Wei
Author affiliations

Southern Denmark Univ IMADA Odense Denmark;

USTHB RIMA Lab Algiers Algeria;

Harbin Inst Technol Sch Humanities & Social Sci Shenzhen Peoples R China;

Western Norway Univ Appl Sci HVL Dept Comp Math & Phys Bergen Norway;

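Indexing information
Original format: PDF
Language: English (eng)
Chinese Library Classification: Automatic information theory; Computer applications; Information and knowledge dissemination; Automation technology, computer technology
Keywords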
Document information retrieval; Data mining; Big collections; Cluster-based approaches; Frequent itemset mining;


