International Journal of Computer Mathematics: Computer Systems Theory

Unsupervised feature selection with the largest angle coding



Abstract

In many areas such as machine learning, data mining, and computer vision, feature selection is a crucial and challenging task that seeks a relevant subset of the original features. Unsupervised feature selection performs this task without label information. Many unsupervised feature selection methods select the top-ranked features without analysing the differences among features, so they cannot select a feature subset with strong generality. By analysing the differences among features during unsupervised feature selection, the original dataset can be described more comprehensively by the selected features. In this paper, we propose the difference degree matrix and a new method called unsupervised feature selection with the largest angle coding (FSAC). The difference degree matrix describes how differently the data points are distributed on every pair of features, and FSAC is an effective feature selection method built on it. Unlike existing unsupervised feature selection methods, FSAC selects features by analysing the differences among features and through a self-representation of the difference degree matrix. To make this self-representation more useful and to reduce redundant and noisy features, an ℓ_{2,1}-norm constraint is added to the objective function of FSAC to guarantee that the feature selection matrix is sparse in its rows. Experimental results on several real-world datasets show that FSAC outperforms state-of-the-art methods. We also analyse the sensitivity of the parameter in the objective function.
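The abstract only sketches the approach; the exact construction of the difference degree matrix and the FSAC objective (the "largest angle coding" itself) are not reproduced here. The sketch below is therefore an illustrative stand-in under assumptions: it builds an assumed pairwise difference measure M, solves a generic self-representation problem min_W ||M − MW||_F^2 + λ||W||_{2,1} with the standard iteratively reweighted least-squares scheme for the ℓ_{2,1} term, and ranks features by the row norms of W. All function names and the concrete difference measure are hypothetical, not the authors' formulation.

```python
import numpy as np

def difference_degree_matrix(X):
    """Illustrative pairwise 'difference degree' between features.

    The paper's exact construction is not given in the abstract; as a
    stand-in we measure how differently the (min-max normalised) data
    points are distributed on every pair of features.
    """
    Xn = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)
    d = Xn.shape[1]
    M = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            M[i, j] = np.mean(np.abs(Xn[:, i] - Xn[:, j]))
    return M

def l21_self_representation(M, lam=1.0, n_iter=50):
    """Solve min_W ||M - M W||_F^2 + lam * ||W||_{2,1} by the usual
    iteratively reweighted least-squares trick for the l_{2,1} norm."""
    d = M.shape[1]
    W = np.eye(d)
    MtM = M.T @ M
    for _ in range(n_iter):
        row_norms = np.linalg.norm(W, axis=1) + 1e-8
        G = np.diag(0.5 / row_norms)      # reweighting from the l_{2,1} term
        W = np.linalg.solve(MtM + lam * G, MtM)
    return W

def select_features(X, k, lam=1.0):
    """Rank features by the row norms of the representation matrix W."""
    M = difference_degree_matrix(X)
    W = l21_self_representation(M, lam=lam)
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)[:k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))      # toy data: 200 samples, 20 features
    print(select_features(X, k=5))
```

The ℓ_{2,1} penalty drives whole rows of W toward zero, so features whose rows survive with large norms are the ones most useful for reconstructing the difference degree matrix; this row-sparsity is what the abstract refers to as the feature selection matrix being sparse in its rows.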
