Journal: Modern Applied Science

A New Method of Hierarchical Text Clustering Based on Lsa-Hgsom

Abstract

Text clustering is recognized as an important component of data mining. Models based on the Self-Organizing Map (SOM) have been found to have certain advantages for clustering sizeable text data. However, existing approaches do not provide an adaptive hierarchical structure within a single model. This paper presents a new method of hierarchical text clustering that combines latent semantic analysis (LSA) with a hierarchical GSOM, called the LSA-HGSOM method. Clustering results from traditional methods cannot show hierarchical structure, even though that structure is very important in text clustering. The LSA-HGSOM method achieves hierarchical text clustering automatically. It builds a vector space model (VSM) of term weights using LSA theory, so that semantic relations are included in the vector space model. Both theoretical analysis and experimental results confirm that the LSA-HGSOM method reduces the number of vectors and improves the efficiency and precision of text clustering.
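
The LSA step mentioned in the abstract can be illustrated with a minimal sketch: build a term-weight vector space model, then project the documents into a lower-dimensional latent space with a truncated SVD. This is not the paper's implementation. The TF-IDF weighting, the function names `tfidf_matrix` and `lsa_project`, the toy documents, and the choice `k=2` are all assumptions made for illustration, and the hierarchical GSOM stage is not shown.

```python
import numpy as np

def tfidf_matrix(docs):
    """Build a term-by-document TF-IDF matrix from whitespace-tokenized docs.

    The weighting scheme is an assumption; the abstract only says
    "term weight" without naming a formula.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            counts[index[w], j] += 1.0
    # Document frequency: number of documents containing each term.
    df = np.count_nonzero(counts, axis=1)
    idf = np.log(len(docs) / df) + 1.0
    return counts * idf[:, None], vocab

def lsa_project(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Rows of the result are document coordinates scaled by singular values.
    return (s[:k, None] * vt[:k, :]).T

# Hypothetical toy corpus, used only to exercise the functions.
docs = [
    "text clustering with self organizing maps",
    "hierarchical clustering of text documents",
    "latent semantic analysis of documents",
    "semantic analysis with vector space models",
]
X, vocab = tfidf_matrix(docs)
Z = lsa_project(X, k=2)
print(X.shape, Z.shape)
```

Each document is then represented by `k` latent coordinates instead of one weight per vocabulary term, and a clustering model such as a SOM variant would take the rows of `Z` as input.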