Research of Feature Selection for Text Clustering Based on Cloud Model


Abstract

Text clustering is a form of unsupervised machine learning, so the discriminability of class attributes cannot be measured during clustering, and traditional text feature selection methods cannot effectively solve the high-dimensionality problem. To overcome these weaknesses in existing feature selection, this paper proposes a new method that introduces cloud model theory into feature selection and constructs a cloud filter for clustering documents. The distribution of document words is modeled at a microscopic level. By employing the digital characteristics of the cloud model, the separability between feature words can be computed more effectively. Experimental results with the K-means algorithm show that the method can remarkably improve the accuracy of text clustering.
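The abstract does not state how the digital characteristics are computed. As a rough illustration only, the sketch below uses the standard backward cloud generator (without certainty degrees) from cloud model theory to estimate the expectation Ex, entropy En, and hyper-entropy He from a set of samples. The function name `backward_cloud` and the sample weights are hypothetical, and this is not necessarily the procedure the authors use.

```python
import math


def backward_cloud(samples):
    """Estimate the cloud digital characteristics (Ex, En, He) from 1-D samples.

    Assumption: the standard backward cloud generator without certainty
    degrees; the paper's own formulas are not given in the abstract.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Ex: sample mean.
    ex = sum(samples) / n
    # En: scaled first-order absolute central moment.
    abs_dev = sum(abs(x - ex) for x in samples) / n
    en = math.sqrt(math.pi / 2) * abs_dev
    # He: spread not explained by En, clamped at zero.
    var = sum((x - ex) ** 2 for x in samples) / (n - 1)
    he = math.sqrt(max(var - en ** 2, 0.0))
    return ex, en, he


# Hypothetical TF-IDF weights of one term across six documents.
weights = [0.12, 0.00, 0.31, 0.08, 0.00, 0.27]
ex, en, he = backward_cloud(weights)
print(f"Ex={ex:.3f}, En={en:.3f}, He={he:.3f}")
```

Under this reading, each feature word would be summarized by one (Ex, En, He) triple, and words whose clouds overlap heavily would be candidates for filtering before clustering; how the paper actually measures separability is not stated in the abstract.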

Bibliographic details

  • Source
    Journal of Software | 2013, Issue 12 | pp. 3246-3252 | 7 pages
  • Authors

    Junmin Zhao; Kai Zhang; Jian Wan;

  • Author affiliations

    Henan University of Urban Construction/Institute of Computer Science and Engineering, Pingdingshan, China;

    Henan University of Urban Construction/Institute of Computer Science and Engineering, Pingdingshan, China;

    ZhengZhou ShiYi Technology Co. Ltd, Zhengzhou, China;

  • Original format: PDF
  • Language: English
  • Keywords

    feature selection; cloud model; TF-IDF; K-means algorithm;
