Hierarchical document clustering based on cosine similarity measure

Conference: 2017 1st International Conference on Intelligent Systems and Information Management


Abstract

Clustering is one of the central topics in data mining: it partitions data into meaningful subgroups. Document clustering divides a set of documents into groups such that different groups show different characteristics with respect to similarity. This paper presents an experimental exploration of HSC, a similarity-based method for measuring the similarity between data objects, particularly text documents. It also provides an incremental algorithm that evaluates cluster similarity between documents, which the authors report leads to much improved results over traditional methods. Finally, it considers how to select an appropriate similarity measure for analyzing the similarity between documents.
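
To make the idea of cosine-based hierarchical clustering concrete, here is a minimal sketch. It is not the paper's HSC method or its incremental algorithm, neither of which is described in the abstract. It assumes plain whitespace tokenization, raw term counts, and average-linkage agglomerative merging; the names `cosine_similarity`, `agglomerate`, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine of the angle between the term-count vectors of two texts."""
    # Assumption: lowercase whitespace tokenization. Real pipelines usually
    # add stop-word removal, stemming, and TF-IDF weighting.
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def agglomerate(docs: list[str], threshold: float) -> list[list[int]]:
    """Average-linkage agglomerative clustering over document indices.

    Repeatedly merges the two most similar clusters and stops once the best
    available merge falls below `threshold` (a hypothetical cut-off).
    """
    sim = [[cosine_similarity(a, b) for b in docs] for a in docs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1: list[int], c2: list[int]) -> float:
        # Mean pairwise similarity between the members of two clusters.
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best_score, (i, j) = max(
            (linkage(clusters[i], clusters[j]), (i, j))
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best_score < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    sample_docs = [
        "data mining clustering",
        "clustering data mining methods",
        "football match score",
        "football score",
    ]
    print(agglomerate(sample_docs, threshold=0.5))
```

Because cosine similarity depends only on the angle between term vectors, two documents with the same word proportions score as similar even when their lengths differ, which is one common reason it is chosen for text.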
