A Hypergraph-Based Approach to Feature Selection

Abstract

In many data analysis tasks, one is confronted with the problem of selecting features from very high-dimensional data. Feature selection is essentially a combinatorial optimization problem and is computationally expensive to solve. To make it tractable, it is frequently assumed that features influence the class variable either independently or only through pairwise feature interactions. To relax this assumption, we draw on recent work on hypergraph clustering to extract maximally coherent feature groups from a set of objects using high-order (rather than pairwise) similarities. We propose a three-step algorithm that (i) constructs a graph in which each node corresponds to a feature and each edge has a weight corresponding to the interaction information among the features connected by that edge, (ii) performs hypergraph clustering to select a highly coherent set of features, and (iii) further selects features based on a new measure called the multidimensional interaction information (MII). The advantage of MII is that it incorporates third- and higher-order feature interactions. This is made feasible by hypergraph clustering, which separates features into clusters prior to selection and thereby limits the search space for higher-order interactions. Experimental results demonstrate the effectiveness of our feature selection method on a number of standard datasets.
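
The abstract does not define its edge weights or the MII measure in detail, so the following is only a minimal sketch of the general idea of a higher-order (three-way) interaction score, not the authors' method. It computes the classical three-variable interaction information from empirical entropies, assuming discrete features and the sign convention I(X;Y;Z) = I(X;Y|Z) - I(X;Y). The function names `entropy` and `interaction_information` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Empirical joint Shannon entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))          # one tuple per sample
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def interaction_information(x, y, z):
    """Three-way interaction information, I(X;Y|Z) - I(X;Y), via joint entropies.

    Assumed sign convention: a positive value means X and Y together tell us
    more about Z than they do separately (synergy).
    """
    return (
        -(entropy(x) + entropy(y) + entropy(z))
        + entropy(x, y) + entropy(x, z) + entropy(y, z)
        - entropy(x, y, z)
    )


# Toy data (hypothetical): z = x XOR y, so neither x nor y alone predicts z,
# but the pair determines it completely.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
z = [a ^ b for a, b in zip(x, y)]
xor_score = interaction_information(x, y, z)

# Contrast case: z is just a copy of x, and y is independent of both.
copy_score = interaction_information(x, y, list(x))
```

In the XOR case every pairwise mutual information is zero, so a method restricted to independent or pairwise effects would miss the dependence that the three-way score detects. Note that the opposite sign convention is also in common use.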
