Journal: The International Arab Journal of Information Technology

Preceding Document Clustering by Graph Mining Based Maximal Frequent Termsets Preservation


Abstract

This paper presents an approach to document clustering. It introduces a novel graph-mining-based algorithm that finds the frequent termsets present in a document set. The document set is first mapped onto a bipartite graph. Based on the results of the algorithm, the document set is then modified to reduce its dimensionality, and the bisecting K-means algorithm is run over the modified document set to obtain a set of meaningful clusters. The proposed approach, Clustering preceded by Graph Mining based Maximal Frequent Termsets Preservation (CGFTP), is shown to produce better-quality clusters than some classical document clustering algorithms, and the resulting clusters are shown to be easily interpretable. Cluster quality is measured in terms of F-measure.
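The abstract does not spell out the graph mining algorithm itself, so the following is only a minimal sketch of the general idea under stated assumptions. It maps documents onto a bipartite term-to-document structure, finds frequent termsets with a plain level-wise (Apriori-style) search standing in for the paper's algorithm, keeps the maximal ones, and reduces each document to the terms those termsets contain. The function names, the whitespace tokenization, the `min_support` threshold, and the reduction rule are all illustrative assumptions rather than details taken from the paper. The bisecting K-means step is left out.

```python
from itertools import combinations


def build_bipartite(docs):
    """Bipartite mapping: each term node points to the documents containing it."""
    term_to_docs = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):  # assumption: whitespace tokens
            term_to_docs.setdefault(term, set()).add(doc_id)
    return term_to_docs


def frequent_termsets(term_to_docs, min_support):
    """Level-wise search (a stand-in, not the paper's graph mining algorithm).

    Returns {termset: supporting document ids} for every termset that
    occurs in at least min_support documents.
    """
    level = {
        frozenset([term]): doc_ids
        for term, doc_ids in term_to_docs.items()
        if len(doc_ids) >= min_support
    }
    frequent = {}
    while level:
        frequent.update(level)
        next_level = {}
        for (a, docs_a), (b, docs_b) in combinations(level.items(), 2):
            union = a | b
            # Join only termsets that differ in exactly one term.
            if len(union) == len(a) + 1 and union not in next_level:
                shared = docs_a & docs_b  # documents containing every term
                if len(shared) >= min_support:
                    next_level[union] = shared
        level = next_level
    return frequent


def maximal_only(frequent):
    """Keep termsets that have no frequent proper superset."""
    return {
        termset: doc_ids
        for termset, doc_ids in frequent.items()
        if not any(termset < other for other in frequent)
    }


def reduce_documents(docs, maximal):
    """Assumed reduction rule: drop terms outside every maximal termset."""
    keep = set().union(*maximal)
    return [[t for t in text.lower().split() if t in keep] for text in docs]


# Toy document set, invented for illustration only.
docs = [
    "graph mining clusters documents",
    "graph mining finds termsets",
    "graph mining reduces dimensionality",
    "bisecting kmeans clusters documents",
]
graph = build_bipartite(docs)
maximal = maximal_only(frequent_termsets(graph, min_support=2))
reduced = reduce_documents(docs, maximal)
```

With a support threshold of 2, only terms that co-occur in at least two documents survive, so the reduced documents have a much smaller vocabulary than the originals. A clustering step such as bisecting K-means would then run on a vector representation of `reduced`.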