Comparison Jaccard similarity, Cosine Similarity and Combined Both of the Data Clustering With Shared Nearest Neighbor Method

Abstract

Text mining is the automated discovery of new knowledge from information extracted from different text sources. Clustering is a grouping technique that is widely used in data mining. The aim of this study was to find the similarity measure that yields the best similarity values. Three measures were compared: Jaccard similarity, cosine similarity, and a combination of the two. Combining the two measures was expected to increase the similarity value between two titles. The documents used were limited to practical work titles from the Department of Informatics Engineering, University of Ahmad Dahlan. All titles were preprocessed beforehand. The documents were clustered with the Shared Nearest Neighbor (SNN) method. The results show that cosine similarity gives the best proximity (similarity) values compared with Jaccard similarity and the combination of both.
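
As an illustration of the two measures named in the abstract, the following minimal Python sketch computes Jaccard and cosine similarity for a pair of tokenized titles. The example titles are invented, the cosine variant uses raw term counts, and `combined_similarity` is only an assumed way of combining the two (a plain average); the abstract does not say how the study combined them.

```python
import math
from collections import Counter


def jaccard_similarity(a_tokens, b_tokens):
    """Size of the token-set intersection divided by the size of the union."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_similarity(a_tokens, b_tokens):
    """Cosine of the angle between raw term-frequency vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(a_tokens, b_tokens):
    # ASSUMPTION: a plain average, chosen only for illustration.
    # The source does not describe the combination rule it used.
    return 0.5 * (jaccard_similarity(a_tokens, b_tokens)
                  + cosine_similarity(a_tokens, b_tokens))


# Hypothetical, already-preprocessed titles (not taken from the study's data).
title_a = "web based academic information system".split()
title_b = "web based library information system".split()

print(jaccard_similarity(title_a, title_b))   # 4 shared tokens / 6 distinct tokens
print(cosine_similarity(title_a, title_b))    # 4 / (sqrt(5) * sqrt(5))
print(combined_similarity(title_a, title_b))
```

For this pair the cosine value is higher than the Jaccard value, because Jaccard divides by the size of the union while cosine divides by the product of the vector lengths.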
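
The Shared Nearest Neighbor idea mentioned in the abstract can be sketched as follows: two items are considered close when their k-nearest-neighbor lists overlap. This is a rough sketch of the shared-neighbor counting step only, under assumptions: the similarity matrix is invented, `k` is arbitrary, and the rule that both items must appear in each other's neighbor list follows the common Jarvis-Patrick formulation rather than anything stated in the source.

```python
def k_nearest_neighbors(sim, k):
    """For each item i, return the set of the k items most similar to i (excluding i)."""
    n = len(sim)
    neighbors = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: sim[i][j], reverse=True)
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(sim, k):
    """Count shared k-nearest neighbors for every pair of items.

    ASSUMPTION: a pair gets a non-zero value only if each item is in the
    other's k-nearest-neighbor list (Jarvis-Patrick style).
    """
    nn = k_nearest_neighbors(sim, k)
    n = len(sim)
    snn = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j in nn[i] and i in nn[j]:
                shared = len(nn[i] & nn[j])
                snn[i][j] = snn[j][i] = shared
    return snn


# Hypothetical pairwise similarities for four titles (for example, cosine values).
sim = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.6, 0.2],
    [0.7, 0.6, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]

snn = snn_similarity(sim, k=2)
for row in snn:
    print(row)
```

In this toy matrix the first three titles end up linked to one another, while the fourth has no SNN link because none of the others lists it among their two nearest neighbors. A full SNN clustering would go on to threshold these counts and group the linked items.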

Bibliographic details

  • Author

    Zahrotun Lisna

  • Year 2016
  • Original format PDF
  • Language eng