A fuzzy clustering approach for finding similar documents using a novel similarity measure

Ridvan Saracoglu; Kemal Tuetuencue; Novruz Allahverdi

Abstract

Searching for similar documents plays a crucial role in document management. This paper aims to develop a fast, high-quality method for finding similar documents in large document collections, based on fuzzy clustering. To meet these requirements, a two-layer structure is proposed. Previously, document similarity was found by word-by-word comparison. The proposed method instead passes documents through the two-layer structure to find similarities. In this system, predefined fuzzy clusters are used to extract feature vectors of related documents, and the similarity measure is estimated from these vectors. For this purpose, a distance-based similarity measure is proposed. Empirical results show that the proposed system, using the new similarity measure, performs better than conventional similarity measurement systems.
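
The abstract does not give the formulas behind the feature vectors or the similarity measure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the standard fuzzy c-means membership formula to turn a document into a vector of memberships in predefined clusters, and a simple normalized Euclidean distance as a stand-in for the distance-based similarity. The cluster centers, document vectors, and function names are hypothetical.

```python
import math


def fcm_memberships(doc, centers, m=2.0):
    """Membership of one document vector in each predefined cluster.

    Uses the standard fuzzy c-means membership formula (an assumption;
    the paper's own feature extraction is not described in the abstract).
    """
    dists = [math.dist(doc, c) for c in centers]
    # A document lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


def distance_similarity(u, v):
    """Map the Euclidean distance between two membership vectors to [0, 1].

    Membership vectors sum to 1, so their distance is at most sqrt(2).
    This normalization is an illustrative choice, not the paper's measure.
    """
    return max(0.0, 1.0 - math.dist(u, v) / math.sqrt(2.0))


# Hypothetical cluster centers and documents in a 3-term weight space.
centers = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
doc_a = [0.9, 0.1, 0.0]
doc_b = [0.8, 0.2, 0.0]
doc_c = [0.0, 0.1, 0.9]

u_a = fcm_memberships(doc_a, centers)
u_b = fcm_memberships(doc_b, centers)
u_c = fcm_memberships(doc_c, centers)

sim_ab = distance_similarity(u_a, u_b)
sim_ac = distance_similarity(u_a, u_c)
print("similarity(a, b):", round(sim_ab, 3))
print("similarity(a, c):", round(sim_ac, 3))
```

Comparing short membership vectors rather than full term lists is what would make such a scheme cheaper than word-by-word comparison: the cost per pair depends on the number of clusters, not on document length.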

Bibliographic details

Source: Expert Systems with Applications, 2007, Issue 3, pp. 600-605 (6 pages)
Authors: Ridvan Saracoglu; Kemal Tuetuencue; Novruz Allahverdi
Affiliation: Department of Electronic and Computer Education, Selcuk University, 42031 Konya, Turkey
Original format: PDF
Language: English
CLC classification: Artificial intelligence theory
Keywords: text mining; document similarity; fuzzy clustering; fuzzy similarity measure; distance based similarity

Similar literature

1. Cross-Lingual Document Representation and Semantic Similarity Measure: A Fuzzy Set and Rough Set Based Approach [J]. Huang H.-H., Kuo Y.-H. IEEE Transactions on Fuzzy Systems, 2010, Issue 6.
2. A new approach to fuzzy distance measure and similarity measure between two generalized fuzzy numbers [J]. Debashree Guha, Debjani Chakraborty. Applied Soft Computing, 2010, Issue 1.
3. Semantic Similarity-Based Clustering of Web Documents Using Fuzzy C-Means [J]. J. Avanija, K. Ramar. International Journal of Computational Intelligence and Applications, 2015, Issue 3.
4. A Hybrid Geometric Approach for Measuring Similarity Level Among Documents and Document Clustering [C]. Arash Heidarian, Michael J. Dinneen. IEEE International Conference on Big Data Computing Service and Applications, 2016.
5. A comparison of clustering procedures and similarity measures in creating clusters using warp functions [D]. Elguindi, Anne Charlotte. 2010.
6. A New Validity Index Based on Fuzzy Energy and Fuzzy Entropy Measures in Fuzzy Clustering Problems [O]. Ferdinando Di Martino, Salvatore Sessa. 2020.
7. FUSE (Fuzzy Similarity Measure) - A measure for determining fuzzy short text similarity using Interval Type-2 fuzzy sets [O]. Naeemeh Adel, Keeley Crockett, Alan Crispin. 2018.
