Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

Abstract

When employing model selection methods with oracle properties, such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, with m = 10. In problems where the true regression function is sparse and the signals are large, such cross-validation typically works well. However, in regression modeling of genomic studies involving single nucleotide polymorphisms (SNPs), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of variables selected by SCAD and the Adaptive Lasso with 10-fold cross-validation is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, not just SCAD and the Adaptive Lasso.
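
The run-to-run variation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or data: it uses the plain Lasso (which the abstract says behaves similarly) rather than SCAD or the Adaptive Lasso, a hand-written coordinate-descent solver, and simulated Gaussian predictors instead of SNP genotypes. The sample size, number of predictors, effect size, penalty grid, and number of repetitions are all assumed values chosen only for illustration. The same data set is passed through 10-fold cross-validation several times with different random fold assignments, and the number of selected variables is recorded for each run.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=50, tol=1e-6):
    """Lasso by cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()                       # residual y - X @ beta
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # correlation of column j with the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta

def cv_selected_count(X, y, lams, m, rng):
    """One run of m-fold CV: choose lam by held-out error, refit, count nonzeros."""
    n = X.shape[0]
    folds = np.array_split(rng.permutation(n), m)   # random fold assignment
    err = np.zeros(len(lams))
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for k, lam in enumerate(lams):
            b = lasso_cd(X[train], y[train], lam)
            err[k] += ((y[test_idx] - X[test_idx] @ b) ** 2).sum()
    best = lams[int(np.argmin(err))]
    return int(np.count_nonzero(lasso_cd(X, y, best)))

def simulate(n=200, p=20, n_signal=5, effect=0.15, seed=0):
    """Sparse truth with weak signals (all values here are assumptions)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:n_signal] = effect
    y = X @ beta + rng.standard_normal(n)
    # standardize once on the full data so no intercept is needed
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    return X, y - y.mean()

def repeated_cv_counts(X, y, n_runs=8, m=10, n_lams=8, seed=1):
    """Repeat m-fold CV on the SAME data with different random folds."""
    rng = np.random.default_rng(seed)
    lam_max = np.abs(X.T @ y).max() / len(y)        # smallest lam giving beta = 0
    lams = lam_max * np.logspace(0, -2, n_lams)
    return [cv_selected_count(X, y, lams, m, rng) for _ in range(n_runs)]

X, y = simulate()
counts = repeated_cv_counts(X, y)
print("selected variables per CV run:", counts)
print("min:", min(counts), "max:", max(counts))
```

Only the fold assignment changes between runs, so any spread in the printed counts comes from the cross-validation split alone. Whether the spread is large depends on the assumed signal strength and sample size; raising `effect` in `simulate` is one way to compare against a strong-signal setting.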

Bibliographic record

  • Source
    The American Statistician, 2011, No. 4, pp. 223-228 (6 pages)
  • Author affiliations

    Department of Epidemiology & Biostatistics, School of Rural Public Health, Texas A&M Health Science Center, 1266 TAMU, College Station, TX 77843-1266;

    Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843-3143;

    School of Mathematics and Statistics, University of Sydney, NSW 2006 Australia;

    Division of Cancer Epidemiology and Genetics, National Cancer Institute, 6120 Executive Blvd, EPS 8038 Rockville, MD 20852;

    Division of Cancer Epidemiology and Genetics, National Cancer Institute, 6120 Executive Blvd, EPS 8038 Rockville, MD 20852;

  • Original format: PDF
  • Language: English
  • Keywords

    adaptive lasso; lasso; model selection; oracle estimation;
