
Text document clustering based on frequent concepts



Abstract

This paper presents a novel document clustering technique based on frequent concepts. The proposed algorithm, FCDC (Frequent Concepts based Document Clustering), works with frequent concepts rather than the frequent itemsets used in traditional text mining techniques. Many well-known clustering algorithms treat documents as a bag of words and ignore important relationships between words, such as synonymy. The proposed algorithm uses the semantic relationships between words to create concepts. It exploits the WordNet ontology to create a low-dimensional feature vector, which allows a more accurate clustering algorithm to be developed.
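The idea of replacing words with concepts and keeping only the frequent ones can be illustrated with a minimal sketch. It is not the FCDC algorithm from the paper: the small `SYNONYM_TO_CONCEPT` table is a hypothetical stand-in for WordNet lookups, and the function names, the sample documents, and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stand-in for WordNet: each word maps to one concept label.
# A real system would look up synsets in WordNet instead.
SYNONYM_TO_CONCEPT = {
    "car": "automobile", "auto": "automobile", "automobile": "automobile",
    "doctor": "physician", "physician": "physician",
    "purchase": "buy", "buy": "buy",
}


def to_concepts(tokens):
    """Replace each token with its concept label; unknown words map to themselves."""
    return [SYNONYM_TO_CONCEPT.get(t, t) for t in tokens]


def frequent_concepts(docs, min_support):
    """Return, sorted, the concepts that occur in at least min_support of the documents."""
    doc_freq = Counter()
    for tokens in docs:
        # Count each concept once per document (document frequency).
        doc_freq.update(set(to_concepts(tokens)))
    threshold = min_support * len(docs)
    return sorted(c for c, n in doc_freq.items() if n >= threshold)


def concept_vectors(docs, concepts):
    """Build one count vector per document over the frequent concepts only."""
    vectors = []
    for tokens in docs:
        counts = Counter(to_concepts(tokens))
        vectors.append([counts[c] for c in concepts])
    return vectors


# Illustrative documents (assumed, not from the paper), already tokenized.
docs = [
    ["doctor", "buy", "car"],
    ["physician", "purchase", "auto"],
    ["automobile", "engine", "repair"],
    ["doctor", "hospital", "visit"],
]

concepts = frequent_concepts(docs, min_support=0.5)
vectors = concept_vectors(docs, concepts)
print(concepts)  # ['automobile', 'buy', 'physician']
print(vectors)   # [[1, 1, 1], [1, 1, 1], [1, 0, 0], [0, 0, 1]]
```

In this toy example the first two documents share no words, yet they receive identical vectors because their synonyms collapse to the same concepts, and the 11 distinct words shrink to 3 dimensions. The resulting vectors could then be passed to any standard clustering method.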


Bibliographic details

Source: 2010 1st International Conference on Parallel Distributed and Grid Computing, 2010, pp. 366-371 (6 pages)
Original format: PDF
CLC classification: Network computers (NC); Computer networks

Similar literature


1. A SOM-Based Document Clustering Using Frequent Max Substrings for Non-Segmented Texts [J]. Todsanai Chumwatana, Kok Wai Wong, Hong Xie. Journal of Intelligent Learning Systems and Applications, 2010, No. 3.
2. Text document clustering based on frequent word meaning sequences [J]. Yanjun Li, Soon M. Chung, John D. Holt. Data & Knowledge Engineering, 2008, No. 1.
3. DIC-DOC-K-means: Dissimilarity-based Initial Centroid selection for DOCument clustering using K-means for improving the effectiveness of text document clustering [J]. Lakshmi R., Baskar S. Journal of Information Science, 2019, No. 6.
4. Text document clustering based on frequent concepts [C]. {missing}. International Conference on Parallel Distributed and Grid Computing, 2010.
5. Frequent item-based text clustering [D]. Afshar, Homayoun. 2003.
6. Thematic clustering of text documents using an EM-based approach [O]. Sun Kim, W John Wilbur. 2012.
7. Frequent Itemset-based Text Clustering Approach to Cluster Ranked Documents [O]. Snehalata Nandanwar, Geetanjali Kale, Sheetal Sonawane. 2014.
