
Unsupervised Feature Selection for Multi-Cluster Data

Abstract

In many data analysis tasks, one is often confronted with very high-dimensional data. Feature selection techniques are designed to find the relevant subset of the original features, which can facilitate clustering, classification and retrieval. In this paper, we consider the feature selection problem in the unsupervised learning scenario, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. The feature selection problem is essentially a combinatorial optimization problem, which is computationally expensive. Traditional unsupervised feature selection methods address this issue by selecting the top-ranked features based on certain scores computed independently for each feature. These approaches neglect the possible correlation between different features and thus cannot produce an optimal feature subset. Inspired by recent developments in manifold learning and L1-regularized models for subset selection, we propose in this paper a new approach, called Multi-Cluster Feature Selection (MCFS), for unsupervised feature selection. Specifically, we select those features such that the multi-cluster structure of the data can be best preserved. The corresponding optimization problem can be efficiently solved since it only involves a sparse eigen-problem and an L1-regularized least squares problem. Extensive experimental results over various real-life data sets have demonstrated the superiority of the proposed algorithm.
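The two computational steps named in the abstract, an eigen-problem on a graph built from the data followed by an L1-regularized least squares fit, can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the function names (`knn_affinity`, `spectral_embedding`, `lasso_cd`, `mcfs_scores`) are hypothetical, the graph is a simple 0/1 k-nearest-neighbour graph, the eigen-problem is solved densely with `np.linalg.eigh` instead of a sparse solver, the L1 problem is solved by plain coordinate descent, and each feature is scored by its largest absolute regression coefficient across eigenvectors.

```python
import numpy as np


def knn_affinity(X, n_neighbors=5):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix (dense, small data only)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    n = X.shape[0]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                         # symmetrize


def spectral_embedding(W, n_eigenvectors):
    """Solve L y = lambda D y through the symmetric normalized Laplacian."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)                   # eigenvalues in ascending order
    # Skip the first (trivial) eigenvector and map back with y = D^{-1/2} v.
    return d_inv_sqrt[:, None] * vecs[:, 1:n_eigenvectors + 1]


def lasso_cd(X, y, alpha, n_iter=100):
    """Minimize (1/2n)||y - Xa||^2 + alpha * ||a||_1 by coordinate descent."""
    n, p = X.shape
    a = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()                        # residual y - X a
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * a[j]                       # remove feature j's contribution
            rho = X[:, j] @ r / n
            a[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            r -= X[:, j] * a[j]                       # add the updated contribution back
    return a


def standardize(A):
    """Center columns and scale to unit variance; constant columns become zero."""
    A = A - A.mean(axis=0)
    s = A.std(axis=0)
    s[s < 1e-12] = 1.0
    return A / s


def mcfs_scores(X, n_eigenvectors, n_neighbors=5, alpha=0.05):
    """Score each feature by its largest |coefficient| over the eigenvector fits."""
    Y = standardize(spectral_embedding(knn_affinity(X, n_neighbors), n_eigenvectors))
    Xs = standardize(X)
    coefs = np.array([lasso_cd(Xs, Y[:, k], alpha) for k in range(Y.shape[1])])
    return np.abs(coefs).max(axis=0)
```

Calling `mcfs_scores(X, n_eigenvectors=K - 1)` on an `(n_samples, n_features)` array returns one non-negative score per feature, and keeping the highest-scoring features gives the selected subset. Because each eigenvector is regressed on all features jointly, correlated features compete for weight, which is the contrast the abstract draws with per-feature ranking scores. For large data sets a sparse graph and a sparse eigen-solver would replace the dense steps used here.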
