Journal: Expert Systems with Applications

A fuzzy clustering approach for finding similar documents using a novel similarity measure



Abstract

Searching for similar documents plays a crucial role in document management. This paper aims to develop a fast, high-quality method for finding similar documents in large document collections, based on fuzzy clustering. To meet these requirements, a two-layer structure is proposed. Previously, document similarity was determined by word-by-word comparison. The proposed method instead passes documents through the two-layer structure to find similarities. In this system, predefined fuzzy clusters are used to extract feature vectors of related documents, and these vectors are then used to find similar documents. The similarity measure is estimated from these vectors; for this purpose, a distance-based similarity measure is proposed. Empirical results show that the proposed system, using the new similarity measure, performs better than conventional similarity measurement systems.
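The abstract does not give the membership function or the exact form of the distance-based measure, so the following is only a minimal sketch of the general idea. It assumes fuzzy c-means style memberships against fixed (predefined) cluster centroids as the feature vector, and a similarity of 1 / (1 + Euclidean distance) between two such vectors. The function names, the fuzzifier `m`, the toy centroids, and the similarity formula are all illustrative assumptions, not the paper's method.

```python
import math


def fuzzy_memberships(doc_vec, centroids, m=2.0):
    """Membership of one document in each predefined cluster.

    Assumption: the standard fuzzy c-means membership formula with
    fuzzifier m > 1. The paper's actual feature extraction may differ.
    """
    dists = [math.dist(doc_vec, c) for c in centroids]
    # A document lying exactly on a centroid belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def distance_similarity(u, v):
    """Hypothetical distance-based similarity, mapped into (0, 1]."""
    return 1.0 / (1.0 + math.dist(u, v))


# Toy data: three predefined clusters in a 3-term vector space (assumed).
centroids = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
doc_a = [0.9, 0.1, 0.0]
doc_b = [0.8, 0.2, 0.0]
doc_c = [0.0, 0.1, 0.9]

feat_a = fuzzy_memberships(doc_a, centroids)
feat_b = fuzzy_memberships(doc_b, centroids)
feat_c = fuzzy_memberships(doc_c, centroids)

sim_ab = distance_similarity(feat_a, feat_b)
sim_ac = distance_similarity(feat_a, feat_c)
print(f"sim(a, b) = {sim_ab:.3f}, sim(a, c) = {sim_ac:.3f}")
```

In this sketch, documents are compared through short membership vectors (one entry per cluster) rather than word by word, which is the kind of saving the abstract attributes to the cluster-based features.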
