Preceding Document Clustering by Graph Mining Based Maximal Frequent Termsets Preservation

Shah Syed; Amjad Mohammad

Abstract

This paper presents an approach to document clustering. It introduces a novel graph-mining-based algorithm for finding the frequent termsets present in a document set. The document set is first mapped onto a bipartite graph. Based on the output of the algorithm, the document set is then modified to reduce its dimensionality, and the Bisecting K-means algorithm is run over the modified document set to obtain a set of meaningful clusters. The proposed approach, Clustering preceded by Graph Mining based Maximal Frequent Termsets Preservation (CGFTP), is shown to produce better-quality clusters than some classical document clustering algorithms, and the resulting clusters are shown to be easily interpretable. Cluster quality is measured in terms of F-measure.
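
The abstract does not spell out the CGFTP algorithm itself, so the following is only a minimal sketch of the general idea it describes: map documents to a bipartite term-document graph, find the maximal frequent termsets, and keep only the terms they contain. The level-wise (Apriori-style) search used here is a stand-in chosen for brevity, not the paper's graph mining algorithm, and the function names, the toy documents, and the `min_support` threshold are all assumptions made for illustration.

```python
from itertools import combinations


def build_bipartite(docs):
    """Bipartite graph as adjacency: each term node maps to its document nodes."""
    term_to_docs = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            term_to_docs.setdefault(term, set()).add(doc_id)
    return term_to_docs


def frequent_termsets(term_to_docs, min_support):
    """Level-wise search (illustrative stand-in, not the paper's algorithm).

    The support of a termset is the number of documents adjacent to all
    of its terms, i.e. the size of the intersection of their neighbour sets.
    """
    level = {frozenset([t]): d for t, d in term_to_docs.items()
             if len(d) >= min_support}
    frequent = dict(level)
    while level:
        next_level = {}
        for (a, docs_a), (b, docs_b) in combinations(level.items(), 2):
            candidate = a | b
            # Only join two k-termsets that differ in exactly one term.
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            shared = docs_a & docs_b
            if len(shared) >= min_support:
                next_level[candidate] = shared
        frequent.update(next_level)
        level = next_level
    return frequent


def maximal_only(frequent):
    """Keep termsets that have no frequent proper superset."""
    return {ts: d for ts, d in frequent.items()
            if not any(ts < other for other in frequent)}


def reduce_documents(docs, maximal):
    """Drop every term that does not occur in some maximal frequent termset."""
    keep = set().union(*maximal)
    return [" ".join(w for w in text.lower().split() if w in keep)
            for text in docs]


# Toy document set (hypothetical data, for illustration only).
docs = [
    "graph mining finds frequent termsets",
    "graph mining maps documents",
    "bisecting kmeans clusters documents",
    "graph mining clusters documents",
]
graph = build_bipartite(docs)
maximal = maximal_only(frequent_termsets(graph, min_support=2))
reduced = reduce_documents(docs, maximal)
```

The reduced documents could then be vectorised and passed to a Bisecting K-means implementation (for example, `BisectingKMeans` in recent scikit-learn releases), which is the clustering step the abstract names. The brute-force search above grows quickly with vocabulary size and is meant only to make the pipeline concrete.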


Bibliographic details

Source
The International Arab Journal of Information Technology, 2019, No. 3, pp. 364-370 (7 pages)
Authors
Shah Syed; Amjad Mohammad
Affiliations
Jamia Millia Islamia, Dept. Comp. Engn., New Delhi, India (both authors)
Original format
PDF
Language
English
Keywords
Bipartite graph; graph mining; frequent termsets mining; bisecting K-means;


