Hierarchical document clustering based on cosine similarity measure

Abstract

Clustering is one of the prime topics in data mining. It partitions data into meaningful subgroups. Document clustering divides a set of documents into groups such that different groups show different characteristics with respect to similarity. This paper presents an experimental exploration of a similarity-based method, HSC, for measuring the similarity between data objects, particularly text documents. It also provides an incremental algorithm that evaluates cluster similarity between documents, which the authors report gives much improved results over other traditional methods. The paper further focuses on selecting an appropriate similarity measure for analyzing the similarity between documents.
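The abstract does not describe HSC or its incremental algorithm in detail, so the following is only a generic sketch of the two ideas named in the title: cosine similarity between term-frequency vectors, and bottom-up (agglomerative) hierarchical clustering driven by that similarity. The function names, the average-link merge rule, the whitespace tokenizer, and the `threshold` parameter are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def agglomerative_clusters(docs, threshold):
    """Average-link agglomerative clustering (illustrative, not HSC).

    Repeatedly merges the two most similar clusters and stops once the
    best available similarity drops below `threshold`.
    Returns clusters as lists of document indices.
    """
    # Assumption: plain lowercase whitespace tokenization, raw term counts.
    vectors = [Counter(d.lower().split()) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def avg_sim(c1, c2):
        total = sum(
            cosine_similarity(vectors[i], vectors[j]) for i in c1 for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


docs = [
    "cosine similarity for document clustering",
    "document clustering with cosine similarity",
    "weather forecast rain tomorrow",
    "rain and storm weather forecast",
]
clusters = agglomerative_clusters(docs, threshold=0.3)
print(clusters)
```

Cosine similarity depends only on the direction of the term vectors, not their length, which is one common reason it is preferred over Euclidean distance when documents differ in size.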

Bibliographic record

Source
2017 1st International Conference on Intelligent Systems and Information Management, 2017, pp. 153-159 (7 pages)
Conference location: Aurangabad (IN)
Authors
Shraddha K. Popat; Pramod B. Deshmukh; Vishakha A. Metre;
Author affiliations

D.Y. Patil College of Engineering, Akurdi, Pune, MH, India;

D.Y. Patil College of Engineering, Akurdi, Pune, MH, India;

D.Y. Patil College of Engineering, Akurdi, Pune, MH, India;

Original format: PDF
Language: English
Keywords
Clustering algorithms; Weight measurement; Algorithm design and analysis; Classification algorithms; Size measurement; Text categorization;

