BMC Medical Research Methodology

A measure of the impact of CV incompleteness on prediction error estimation with application to PCA and normalization

Abstract

Background: In applications of supervised statistical learning in the biomedical field, it is necessary to assess the prediction error of the respective prediction rules. Often, data preparation steps are performed on the entire dataset before training/test-set-based prediction error estimation by cross-validation (CV), an approach referred to as “incomplete CV”. Whether incomplete CV can result in an optimistically biased error estimate depends on the data preparation step under consideration. Several empirical studies have investigated the extent of the bias induced by performing preliminary supervised variable selection before CV. To our knowledge, however, the potential bias induced by other data preparation steps has not yet been examined in the literature. In this paper we investigate this bias for two common data preparation steps: normalization and principal component analysis (PCA) for dimension reduction of the covariate space. Furthermore, we obtain preliminary results for the following steps: optimization of tuning parameters, variable filtering by variance, and imputation of missing values.
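
The distinction between incomplete and full CV described above can be made concrete with a small sketch. The code below is illustrative only and is not the authors' procedure or measure: it assumes synthetic two-class data, standardization as the normalization step, and a nearest-centroid classifier, and every function name (`fit_standardizer`, `apply_standardizer`, `nearest_centroid_error`, `cv_error`, `make_data`) is hypothetical. With `complete=False` the standardization parameters are estimated once on the whole dataset before splitting; with `complete=True` they are re-estimated on each training fold and then applied to the held-out fold.

```python
import random
import statistics


def fit_standardizer(rows):
    """Estimate per-column mean and standard deviation from the given rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # avoid division by zero
    return means, sds


def apply_standardizer(rows, params):
    """Standardize rows using previously estimated means and deviations."""
    means, sds = params
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def nearest_centroid_error(train_x, train_y, test_x, test_y):
    """Misclassification rate of a nearest-centroid rule (illustrative choice)."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [statistics.fmean(c) for c in zip(*members)]
    errors = 0
    for x, y in zip(test_x, test_y):
        pred = min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )
        errors += pred != y
    return errors / len(test_y)


def cv_error(x, y, k, complete, seed=0):
    """K-fold CV error; `complete` controls where standardization is fitted."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    if not complete:
        # Incomplete CV: the preparation step sees all observations,
        # including those that will later serve as test data.
        x = apply_standardizer(x, fit_standardizer(x))
    fold_errors = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        train_x = [x[i] for i in train_idx]
        test_x = [x[i] for i in fold]
        if complete:
            # Full CV: parameters come from the training fold only.
            params = fit_standardizer(train_x)
            train_x = apply_standardizer(train_x, params)
            test_x = apply_standardizer(test_x, params)
        fold_errors.append(
            nearest_centroid_error(
                train_x, [y[i] for i in train_idx], test_x, [y[i] for i in fold]
            )
        )
    return statistics.fmean(fold_errors)


def make_data(n=60, p=5, seed=1):
    """Synthetic two-class data (an assumption made for this sketch)."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    x = [[rng.gauss(0.5 * label, 1.0) * (j + 1) for j in range(p)] for label in y]
    return x, y


x, y = make_data()
err_incomplete = cv_error(x, y, k=5, complete=False)
err_full = cv_error(x, y, k=5, complete=True)
print("incomplete CV error:", err_incomplete)
print("full CV error:", err_full)
```

Both calls use the same seed and therefore the same folds, so any difference between the two estimates comes only from where the standardization parameters were estimated. On toy data such as this the two estimates may well coincide; the sketch shows the mechanics, not the size of the effect.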
