Dynamic Data Retrieval Using Incremental Clustering and Indexing

Uma Priya D; Santhi Thilagam P

Abstract

The growth of the Internet and of real-time applications has produced massive volumes of unstructured data, which makes efficient retrieval of dynamic data increasingly complex. Existing research uses clustering methods and indexes to speed up retrieval. However, the quality of clustering depends on the data representation model, and existing models suffer from dimensionality explosion and sparsity. In addition, as documents evolve, reconstructing the index from scratch is expensive. In this work, compact document vectors generated by the Doc2Vec model are used to cluster the documents, and the indexes are updated incrementally, at lower cost, using the diff method. The probabilistic ranking scheme BM25+ is used to improve the quality of retrieval for user queries. The experimental analysis demonstrates that the proposed system significantly improves clustering performance and reduces the time needed to retrieve the top-k results.
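The abstract names two mechanisms, an incrementally updated index and BM25+ ranking, without giving their details. The following is only a minimal sketch of how the two could fit together, not the authors' implementation. It assumes that "diff" means comparing a document's old and new term counts and touching only the postings that changed, and it uses the standard BM25+ formula with commonly used default parameters (`k1=1.2`, `b=0.75`, `delta=1.0`). The class name `IncrementalIndex` and its methods are hypothetical, tokenization is a plain whitespace split, and the Doc2Vec clustering step is left out entirely.

```python
import math
from collections import Counter, defaultdict


class IncrementalIndex:
    """Hypothetical inverted index that rewrites only the postings that changed."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_terms = {}                # doc_id -> Counter of the document's terms
        self.total_len = 0                 # sum of all document lengths, for avgdl

    def upsert(self, doc_id, text):
        """Add or update a document; return the terms whose postings changed."""
        new = Counter(text.lower().split())
        old = self.doc_terms.get(doc_id, Counter())
        # Assumed "diff" step: only terms whose count differs are touched.
        changed = {t for t in old.keys() | new.keys() if old[t] != new[t]}
        for t in changed:
            if new[t]:
                self.postings[t][doc_id] = new[t]
            else:
                del self.postings[t][doc_id]
                if not self.postings[t]:
                    del self.postings[t]
        self.total_len += sum(new.values()) - sum(old.values())
        self.doc_terms[doc_id] = new
        return changed

    def search(self, query, k=3, k1=1.2, b=0.75, delta=1.0):
        """Return the top-k (doc_id, score) pairs ranked by BM25+."""
        n = len(self.doc_terms)
        if n == 0:
            return []
        avgdl = self.total_len / n
        scores = defaultdict(float)
        for t in set(query.lower().split()):
            docs = self.postings.get(t)
            if not docs:
                continue
            idf = math.log((n + 1) / len(docs))
            for doc_id, tf in docs.items():
                dl = sum(self.doc_terms[doc_id].values())
                norm = k1 * (1 - b + b * dl / avgdl)
                # BM25+ adds delta so that any matching document gets a lower bound.
                scores[doc_id] += idf * ((k1 + 1) * tf / (norm + tf) + delta)
        # Sort by descending score, breaking ties by document id.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    index = IncrementalIndex()
    index.upsert("a", "incremental clustering of documents")
    index.upsert("b", "inverted index for retrieval")
    # Updating a document touches only the postings for the terms that differ.
    print(index.upsert("a", "incremental clustering of dynamic documents"))
    print(index.search("dynamic retrieval"))
```

In this sketch an update costs time proportional to the number of distinct terms in the old and new versions of one document, rather than to the size of the whole collection, which is the kind of saving the abstract attributes to incremental updating.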


Bibliographic details

Source: International Journal of Information Retrieval Research, 2020, Issue 3, pp. 74-91 (18 pages)
Authors: Uma Priya D; Santhi Thilagam P
Affiliations: Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal (both authors)
Original format: PDF
Language: English
Keywords: Incremental Clustering; Inverted Indexing; Information Retrieval; Ranked Retrieval; Representation Learning
