Journal: International Journal of Intelligent Enterprise
Clustering of text documents with keyword weighting function

Abstract

In this digital world, data is abundant everywhere and is growing at a phenomenal rate. Making data readily available for decision making is an important task of the data analyst. In this article, we propose an unsupervised learning algorithm for text document clustering that adopts a keyword weighting function. Documents are pre-processed, and relevant keywords are grouped together based on their weights. Clustered keyword weighting (CKW) takes each class in the training collection as a known cluster and searches for feature weights iteratively to optimise the clustering objective function, in order to retrieve the best clustering result. The performance of CKW is validated by clustering the BBC news text collection. Experiments were conducted with simple K-means and hierarchical clustering algorithms, and our keyword weighting and clustering approach showed improved cluster quality compared to the other methods.
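
The abstract does not spell out the CKW update rules, so the following is only a rough illustration of the general idea of iteratively searching for keyword weights that improve a clustering objective. It is not the authors' algorithm: it swaps in a generic feature-weighted k-means (in the style of W-k-means), where each keyword's weight is inversely related to its within-cluster dispersion. The function names `tf_idf` and `weighted_kmeans`, the parameter `beta`, the smoothing term, and the four toy documents are all assumptions made for this sketch; none of them come from the article or from the BBC collection.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (vocab, rows): a sorted vocabulary and one TF-IDF vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    n_docs = len(docs)
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([
            counts[t] / len(doc) * (math.log(n_docs / doc_freq[t]) + 1.0)
            for t in vocab
        ])
    return vocab, rows


def weighted_kmeans(rows, k, beta=2.0, max_iters=50):
    """Cluster rows with k-means while learning one weight per keyword.

    beta must be greater than 1; it controls how sharply weights react
    to differences in within-cluster dispersion (an assumed parameter).
    """
    n, m = len(rows), len(rows[0])
    weights = [1.0 / m] * m
    # Deterministic initialisation: k evenly spaced rows act as first centroids.
    centroids = [rows[i * n // k][:] for i in range(k)]
    labels = None
    for _ in range(max_iters):
        # Step 1: assign each document to the nearest centroid (weighted distance).
        new_labels = [
            min(range(k), key=lambda c: sum(
                weights[j] ** beta * (x[j] - centroids[c][j]) ** 2
                for j in range(m)))
            for x in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop iterating
        labels = new_labels
        # Step 2: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Step 3: update keyword weights from per-keyword within-cluster dispersion.
        disp = [
            sum((x[j] - centroids[lab][j]) ** 2 for x, lab in zip(rows, labels))
            for j in range(m)
        ]
        smooth = sum(disp) / m + 1e-12  # assumed smoothing to avoid division by zero
        inv = [(1.0 / (d + smooth)) ** (1.0 / (beta - 1.0)) for d in disp]
        total = sum(inv)
        weights = [v / total for v in inv]
    return labels, weights


# Toy usage: four hypothetical pre-tokenised documents on two topics.
docs = [s.split() for s in [
    "football match goal team",
    "team goal football win",
    "software chip computer code",
    "computer code software data",
]]
vocab, rows = tf_idf(docs)
labels, weights = weighted_kmeans(rows, k=2)
print(labels)
print(sorted(zip(vocab, weights), key=lambda p: -p[1])[:3])
```

In this sketch, keywords shared consistently by the documents of a cluster end up with larger weights than keywords that occur in only one document, which is one plausible reading of grouping relevant keywords by weight. A supervised variant, closer to the abstract's description of treating each class as a known cluster, could skip Step 1 and hold the labels fixed while only the weights are updated.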