Journal: International Journal of Information Retrieval Research

Dynamic Data Retrieval Using Incremental Clustering and Indexing

Abstract

The evolution of the Internet and of real-time applications has driven the growth of massive volumes of unstructured data, which makes efficient retrieval of dynamic data more complex. Existing research uses clustering methods and indexes to speed up retrieval. However, the quality of a clustering method depends on the data representation model, and existing models suffer from dimensionality explosion and sparsity. In addition, as documents evolve, rebuilding the index from scratch is expensive. In this work, compact document vectors generated by the Doc2Vec model are used to cluster the documents, and the indexes are updated incrementally, at lower cost, using the diff method. The probabilistic ranking scheme BM25+ is used to improve retrieval quality for user queries. Experimental analysis shows that the proposed system significantly improves clustering performance and reduces the time needed to retrieve the top-k results.
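
The abstract does not spell out the diff method or the exact BM25+ variant, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that "diff" means comparing a document's old and new term frequencies and patching only the postings that changed, and it uses one common BM25+ formulation with IDF = log((N + 1) / df) and delta = 1.0. The class `IncrementalIndex`, its methods, the whitespace tokenizer, and the parameter defaults are all hypothetical names and choices made for illustration. Doc2Vec clustering is left out of the sketch.

```python
import math
from collections import Counter, defaultdict


class IncrementalIndex:
    """Hypothetical inverted index that is patched in place when a document changes."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_len = {}                  # doc_id -> number of tokens
        self.doc_tf = {}                   # doc_id -> Counter of its current terms

    def upsert(self, doc_id, text):
        """Add or update a document; return the number of postings touched."""
        new_tf = Counter(text.lower().split())  # assumption: whitespace tokenizer
        old_tf = self.doc_tf.get(doc_id, Counter())
        touched = 0
        # Diff step (assumed): visit only terms whose frequency actually changed.
        for term in set(old_tf) | set(new_tf):
            if old_tf[term] == new_tf[term]:
                continue
            touched += 1
            if new_tf[term] > 0:
                self.postings[term][doc_id] = new_tf[term]
            else:
                del self.postings[term][doc_id]
                if not self.postings[term]:
                    del self.postings[term]
        self.doc_tf[doc_id] = new_tf
        self.doc_len[doc_id] = sum(new_tf.values())
        return touched

    def search(self, query, k=3, k1=1.2, b=0.75, delta=1.0):
        """Return the top-k (doc_id, score) pairs under BM25+."""
        n_docs = len(self.doc_len)
        if n_docs == 0:
            return []
        avg_len = sum(self.doc_len.values()) / n_docs
        scores = defaultdict(float)
        for term in set(query.lower().split()):
            plist = self.postings.get(term)
            if not plist:
                continue
            idf = math.log((n_docs + 1) / len(plist))
            for doc_id, tf in plist.items():
                # Length-normalized term frequency plus the BM25+ lower bound delta.
                norm = k1 * (1 - b + b * self.doc_len[doc_id] / avg_len)
                scores[doc_id] += idf * ((k1 + 1) * tf / (norm + tf) + delta)
        # Sort by descending score; break ties by doc_id for determinism.
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


if __name__ == "__main__":
    index = IncrementalIndex()
    index.upsert("d1", "incremental clustering of documents")
    index.upsert("d2", "ranking documents with bm25")
    # Editing d1 touches only the terms that differ between the two versions.
    print(index.upsert("d1", "incremental indexing of documents"))
    print(index.search("indexing documents", k=2))
```

In this sketch an edit to a document costs time proportional to the number of terms that changed, not to the size of the collection, which is the kind of saving the abstract attributes to incremental updating. Collection-wide statistics such as the average document length are recomputed at query time here for simplicity; a production index would more likely maintain them incrementally as well.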
