
CONFOUNDER ADJUSTMENT IN MULTIPLE HYPOTHESIS TESTING



Abstract

We consider large-scale studies in which thousands of significance tests are performed simultaneously. In some of these studies, the multiple testing procedure can be severely biased by latent confounding factors such as batch effects and unmeasured covariates that correlate with both the primary variable(s) of interest (e.g., treatment variable, phenotype) and the outcome. Over the past decade, many statistical methods have been proposed to adjust for the confounders in hypothesis testing. We unify these methods in the same framework, generalize them to include multiple primary variables and multiple nuisance variables, and analyze their statistical properties. In particular, we provide theoretical guarantees for RUV-4 [Gagnon-Bartsch, Jacob and Speed (2013)] and LEAPP [Ann. Appl. Stat. 6 (2012) 1664–1688], which correspond to two different identification conditions in the framework: the first requires a set of “negative controls” that are known a priori to follow the null distribution; the second requires the true nonnulls to be sparse. Two different estimators, based on RUV-4 and LEAPP, are then applied to these two scenarios. We show that if the confounding factors are strong, the resulting estimators can be asymptotically as powerful as the oracle estimator, which observes the latent confounding factors. For hypothesis testing, we show that the asymptotic z-tests based on the estimators can control the type I error. Numerical experiments show that the false discovery rate is also controlled by the Benjamini–Hochberg procedure when the sample size is reasonably large.
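The abstract's contrast between unadjusted tests and the oracle that observes the latent factors can be illustrated with a small simulation. The sketch below is not the paper's RUV-4 or LEAPP estimator; it only compares a naive regression that ignores a latent confounder with an oracle regression that includes it, and then runs the Benjamini–Hochberg procedure on the resulting z-scores. All dimensions, effect sizes, and loadings (n, p, the 0.5 effects, the 0.8 correlation) are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n, p = 200, 1000            # samples, features (illustrative sizes)
k_nonnull = 50              # number of truly non-null features (sparse alternatives)

x = rng.standard_normal(n)              # primary variable of interest
z = 0.8 * x + rng.standard_normal(n)    # latent confounder, correlated with x

beta = np.zeros(p)
beta[:k_nonnull] = 0.5                  # sparse primary effects
alpha = rng.standard_normal(p)          # confounder loadings on every feature

# outcome matrix: primary effect + confounding effect + noise
Y = np.outer(x, beta) + np.outer(z, alpha) + rng.standard_normal((n, p))

def z_scores(Y, design):
    """OLS z-scores for the first design column, one regression per feature (column of Y)."""
    Q, R = np.linalg.qr(design)
    coef = np.linalg.solve(R, Q.T @ Y)        # least-squares coefficients
    resid = Y - design @ coef
    dof = design.shape[0] - design.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof   # residual variance per feature
    var0 = np.linalg.inv(R.T @ R)[0, 0]       # (X'X)^{-1} entry for the first column
    return coef[0] / np.sqrt(var0 * sigma2)

naive = z_scores(Y, np.column_stack([x, np.ones(n)]))       # ignores the confounder
oracle = z_scores(Y, np.column_stack([x, z, np.ones(n)]))   # observes the latent factor

def bh_reject(zvals, q=0.05):
    """Benjamini-Hochberg step-up procedure on two-sided normal p-values."""
    pvals = 2 * stats.norm.sf(np.abs(zvals))
    m = pvals.size
    order = np.argsort(pvals)
    passed = pvals[order] <= q * np.arange(1, m + 1) / m
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

for name, zvals in [("naive", naive), ("oracle", oracle)]:
    rej = bh_reject(zvals)
    fdp = rej[k_nonnull:].sum() / max(rej.sum(), 1)
    print(f"{name:6s} rejections: {rej.sum():4d}  false discovery proportion: {fdp:.2f}")
```

In a run of this kind, the naive tests typically reject many true nulls because the confounder loads on every feature, while the oracle keeps the false discovery proportion near the nominal 5% level; this mirrors the gap the abstract describes between unadjusted tests and estimators that account for the latent factors.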
