
Local Feature Selection in Text Clustering

Abstract

Feature selection has improved the performance of text clustering. Global feature selection tries to identify a single subset of features that is relevant to all clusters. However, the clustering process might be improved by considering a different subset of features to describe each cluster locally. In this work, we introduce ZOOM-IN, a method that performs local feature selection for partitional hierarchical clustering of text collections. The proposed method explores the diversity of the clusters generated by the hierarchical algorithm, selecting a variable number of features according to the size of each cluster. Experiments were conducted on the Reuters collection, evaluating the bisecting K-means algorithm with both global and local approaches to feature selection. The results showed an improvement in clustering performance when the proposed local method was used.
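The general idea of local feature selection inside bisecting K-means can be illustrated with a minimal sketch. This is not the authors' ZOOM-IN implementation: the abstract does not state how features are scored or how the feature count depends on cluster size, so the in-cluster document-frequency score, the `per_doc` ratio, the deterministic seeding, and all function names below are assumptions made for illustration only.

```python
def local_features(docs, members, per_doc=1.0):
    """Pick term indices for one cluster; the count grows with cluster size."""
    n_terms = len(docs[0])
    # Assumed relevance score: in-cluster document frequency of each term.
    df = [sum(1 for i in members if docs[i][t] > 0) for t in range(n_terms)]
    # Assumed size rule: per_doc features per document, capped at the vocabulary.
    k = max(1, min(n_terms, round(per_doc * len(members))))
    ranked = sorted(range(n_terms), key=lambda t: (-df[t], t))
    return sorted(ranked[:k])


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bisect(docs, members, feats, iters=10):
    """Split one cluster with 2-means, using only the locally selected features."""
    proj = {i: [docs[i][t] for t in feats] for i in members}
    # Assumed seeding: the first member and the member farthest from it.
    c0 = proj[members[0]]
    c1 = proj[max(members, key=lambda i: sq_dist(proj[i], c0))]
    left, right = [], []
    for _ in range(iters):
        left = [i for i in members if sq_dist(proj[i], c0) <= sq_dist(proj[i], c1)]
        right = [i for i in members if i not in left]
        if not left or not right:
            break
        c0 = [sum(col) / len(left) for col in zip(*(proj[i] for i in left))]
        c1 = [sum(col) / len(right) for col in zip(*(proj[i] for i in right))]
    return left, right


def bisecting_kmeans_local(docs, n_clusters, per_doc=1.0):
    """Bisecting K-means that re-selects features before every split."""
    clusters = [list(range(len(docs)))]
    while len(clusters) < n_clusters:
        clusters.sort(key=len)
        target = clusters.pop()  # split the largest remaining cluster
        if len(target) < 2:
            clusters.append(target)
            break
        feats = local_features(docs, target, per_doc)
        left, right = bisect(docs, target, feats)
        if not left or not right:  # cluster could not be separated
            clusters.append(target)
            break
        clusters += [left, right]
    return sorted(sorted(c) for c in clusters)
```

Here `docs` is a list of equal-length term-weight vectors, and `bisecting_kmeans_local(docs, 2)` returns the document indices grouped by cluster. A global approach would call the feature selector once on the whole collection; the sketch instead calls it on each cluster just before splitting it, which is the contrast the abstract draws.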