An Improved Document Clustering Approach with Multi-Viewpoint Based on Different Similarity Measures

Abstract

Electronic information such as online newspapers, journals, conference proceedings, Web sites, and e-mails is growing very fast and in extremely large volumes. Controlling, indexing, or searching such a huge amount of data is not feasible for humans, nor for search engines. Automatic document organization has therefore become a critical issue. Document clustering methods help us understand the data distribution or preprocess data for other applications. For instance, a search engine can produce results more effectively and efficiently if it searches over clustered documents. Document clustering is an automatic, unsupervised learning technique. It groups related documents into the same cluster and unrelated documents into different clusters, so that the documents within a cluster are related to one another and unrelated to documents in other clusters. Applying any clustering method requires a similarity measure, which quantifies the degree of closeness or similarity between the target objects. In this paper, we introduce document clustering based on a multi-viewpoint similarity measure, together with two related document clustering methods. Existing dissimilarity/similarity measures for document clustering use only a single viewpoint, the origin, as their one reference point, whereas ours uses many different viewpoints as references.
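
The contrast between a single viewpoint and many viewpoints can be illustrated with a minimal sketch. The abstract does not state the formula, so the code below assumes one formulation that appears in the multi-viewpoint similarity literature: the similarity of two documents is the average dot product of their difference vectors, taken relative to each viewpoint document. The function names and the toy term-weight vectors are hypothetical and are not taken from the paper.

```python
from math import sqrt


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Single-viewpoint similarity: both vectors are seen from the origin."""
    return dot(u, v) / (sqrt(dot(u, u)) * sqrt(dot(v, v)))


def multi_viewpoint_similarity(di, dj, viewpoints):
    """Assumed formulation: average of (di - dh) . (dj - dh) over viewpoints dh."""
    total = 0.0
    for dh in viewpoints:
        u = [a - b for a, b in zip(di, dh)]  # di as seen from dh
        v = [a - b for a, b in zip(dj, dh)]  # dj as seen from dh
        total += dot(u, v)
    return total / len(viewpoints)


# Hypothetical unit-length term-weight vectors for two documents.
d1 = [1.0, 0.0]
d2 = [0.8, 0.6]
# Hypothetical viewpoint documents, e.g. documents from other clusters.
others = [[0.0, 1.0], [0.6, 0.8]]

single = cosine_similarity(d1, d2)
multi = multi_viewpoint_similarity(d1, d2, others)
# With the origin as the only viewpoint, the measure reduces to the dot product.
from_origin = multi_viewpoint_similarity(d1, d2, [[0.0, 0.0]])
print(single, multi, from_origin)
```

Under this assumption, using the origin as the sole viewpoint gives the plain dot product, which equals cosine similarity for unit-length vectors. The multi-viewpoint value changes with the choice of reference documents.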

Bibliographic details

Source
International Conference on Intelligent Computing and Control Systems | 2018 | pp. 152-157 | 6 pages
Authors
Aniali Gunta; Rahul Dubey;
Original format: PDF
Keywords
Clustering algorithms; Conferences; Control systems; Clustering methods; Data mining; Euclidean distance; Length measurement;

Similar literature

1. An Approach to Improve Quality of Document Clustering by Word Set Based Documenting Clustering Algorithm [J]. Sandeep Sharma, Ruchi Dave, Naveen Hemrajani. Oriental Journal of Computer Science and Technology. 2011, No. 2
2. A fuzzy clustering approach for finding similar documents using a novel similarity measure [J]. Ridvan Saracoglu, Kemal Tuetuencue, Novruz Allahverdi. Expert Systems with Applications. 2007, No. 3
3. Web Service Clustering Approach Based on Network and Fused Document-Based and Tag-Based Topics Similarity [J]. Deng Li Ping, Guo Bing, Zheng Wen. International Journal of Web Services Research. 2021, No. 3
4. An Improved Document Clustering Approach with Multi-Viewpoint Based on Different Similarity Measures [C]. Aniali Gunta, Rahul Dubey. International Conference on Intelligent Computing and Control Systems. 2018
5. A Measure of Voxel Similarity for Improving the Image-Based Quantification of Tissue Structure and Function [D]. Hoisak, Jeremy David Page. 2012
6. Adapting Document Similarity Measures for Ligand-Based Virtual Screening [O]. Mubarak Himmat, Naomie Salim, Mohammed Mumtaz Al-Dabbagh. 2016
7. An improved co-similarity measure for document clustering [O]. Fawad Hussain, Gilles Bisson, Syed Fawad Hussain. 2016
