Journal: Modern Physics Letters B (Condensed Matter Physics, Statistical Physics, Applied Physics)

Efficient text feature extraction by integrating the average linkage and K-medoids clustering

Abstract

Clustering feature words both reduces the dimensionality of the feature subset and eliminates redundancy among the features. However, for a feature set of very high dimension, it is difficult to estimate the value of k accurately for the traditional K-medoids algorithm. Moreover, the clusters produced by the average linkage (AL) algorithm cannot be divided again, and the AL algorithm cannot be used directly for text classification. To overcome these limitations, this paper combines the two algorithms so that they complement each other. In particular, to suit text classification, we improve the AL algorithm and propose the R-2 test statistic to obtain the approximate number of clusters. Finally, the central feature words are retained and the other feature words are deleted. The experimental results show that the new algorithm largely eliminates feature redundancy and improves text classification performance compared with traditional TF-IDF algorithms.
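
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the improved AL algorithm or the R-2 statistic, so here k is simply passed in as a parameter, plain average linkage is used, and Euclidean distance is an assumption. The function names (`average_linkage`, `k_medoids`, `select_central_words`) and the toy word vectors are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(dist, k):
    """Merge clusters bottom-up until k remain (average pairwise distance)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def medoid(members, dist):
    """Member with the smallest total distance to the rest of its cluster."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))

def k_medoids(dist, medoids, max_iter=100):
    """Refine the medoids: assign points to the nearest medoid, then update."""
    for _ in range(max_iter):
        groups = {m: [] for m in medoids}
        for i in range(len(dist)):
            groups[min(medoids, key=lambda m: dist[i][m])].append(i)
        updated = sorted(medoid(g, dist) for g in groups.values() if g)
        if updated == sorted(medoids):
            break
        medoids = updated
    return sorted(medoids)

def select_central_words(words, vectors, k):
    """Keep one central (medoid) word per cluster and drop the rest."""
    dist = [[euclidean(u, v) for v in vectors] for u in vectors]
    clusters = average_linkage(dist, k)          # AL supplies the initial clusters
    seeds = [medoid(c, dist) for c in clusters]  # one seed medoid per AL cluster
    return [words[i] for i in k_medoids(dist, seeds)]

# Toy data (assumed for illustration): two groups of near-synonymous words.
words = ["car", "auto", "vehicle", "apple", "banana", "fruit"]
vectors = [(1.0, 0.0), (0.9, 0.1), (0.8, 0.0),
           (0.0, 1.0), (0.1, 0.9), (0.0, 0.8)]

print(select_central_words(words, vectors, 2))  # ['auto', 'banana']
```

Seeding K-medoids with the medoids of the AL clusters is one way the two methods could complement each other: AL provides a starting partition, and K-medoids can then reassign words, which AL alone cannot do once clusters are merged.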