STatistical Inference Relief (STIR) feature selection

Abstract

Motivation: Relief is a family of machine learning algorithms that use nearest neighbors to select features whose association with an outcome may be due to epistasis or statistical interactions with other features in high-dimensional data. Relief-based estimators are non-parametric in the statistical sense: they have no parameterized model with an underlying probability distribution for the estimator, which makes it difficult to determine the statistical significance of Relief-based attribute estimates. A statistical inferential formalism is therefore needed to avoid imposing arbitrary thresholds when selecting the most important features. We reconceptualize the Relief-based feature selection algorithm to create a new family of STatistical Inference Relief (STIR) estimators that retain the ability to identify interactions while incorporating the sample variance of the nearest-neighbor distances into the attribute importance estimation. This variance permits the calculation of statistical significance of features and adjustment for multiple testing of Relief-based scores. Specifically, we develop a pseudo t-test version of Relief-based algorithms for case-control data.
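
The idea of a pseudo t-test on nearest-neighbor differences can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the function name `stir_t_scores`, the range-normalized Manhattan distance, the fixed number of neighbors `k`, and the pooled-variance two-sample t statistic comparing per-attribute differences to nearest misses (other class) against differences to nearest hits (same class). It is not presented as the authors' implementation.

```python
import math
import random


def stir_t_scores(X, y, k=5):
    """Hypothetical sketch: one pseudo t-statistic per attribute.

    X is a list of numeric rows, y a list of binary class labels.
    A large positive score means an attribute differs more between an
    instance and its nearest misses than between it and its nearest hits.
    """
    n, p = len(X), len(X[0])
    # Range-normalize so every per-attribute difference lies in [0, 1].
    lo = [min(row[a] for row in X) for a in range(p)]
    hi = [max(row[a] for row in X) for a in range(p)]
    span = [(hi[a] - lo[a]) or 1.0 for a in range(p)]

    def diff(i, j, a):
        return abs(X[i][a] - X[j][a]) / span[a]

    hit_diffs = [[] for _ in range(p)]
    miss_diffs = [[] for _ in range(p)]
    for i in range(n):
        # Neighbors are ranked by Manhattan distance over ALL attributes.
        ranked = sorted(
            (sum(diff(i, j, a) for a in range(p)), j)
            for j in range(n) if j != i
        )
        hits = [j for _, j in ranked if y[j] == y[i]][:k]
        misses = [j for _, j in ranked if y[j] != y[i]][:k]
        for a in range(p):
            hit_diffs[a].extend(diff(i, j, a) for j in hits)
            miss_diffs[a].extend(diff(i, j, a) for j in misses)

    scores = []
    for a in range(p):
        h, m = hit_diffs[a], miss_diffs[a]
        mean_h, mean_m = sum(h) / len(h), sum(m) / len(m)
        var_h = sum((d - mean_h) ** 2 for d in h) / (len(h) - 1)
        var_m = sum((d - mean_m) ** 2 for d in m) / (len(m) - 1)
        # Pooled standard deviation of the hit and miss differences.
        pooled = math.sqrt(
            ((len(h) - 1) * var_h + (len(m) - 1) * var_m)
            / (len(h) + len(m) - 2)
        )
        se = pooled * math.sqrt(1 / len(h) + 1 / len(m))
        scores.append((mean_m - mean_h) / se if se > 0 else 0.0)
    return scores


# Synthetic case-control toy data (made up for this sketch):
# attribute 0 tracks the class label, attributes 1-3 are pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[2.0 * label + rng.gauss(0, 0.3)] + [rng.random() for _ in range(3)]
     for label in y]
scores = stir_t_scores(X, y, k=5)
```

Under these assumptions, each score could be converted to a one-sided p-value from a t distribution with `len(h) + len(m) - 2` degrees of freedom and then adjusted for multiple testing. The toy data contains only a main effect; it does not demonstrate the interaction detection the abstract describes.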