We present a novel method for controlling the $k$-familywise error rate ($k$-FWER) in the linear regression setting using the knockoffs framework first introduced by Barber and Candès. Our procedure, which we also refer to as knockoffs, can be applied with any design matrix that has at least as many observations as variables, and does not require knowledge of the noise variance. Unlike other multiple testing procedures that act directly on $p$-values, knockoffs is specifically tailored to linear regression and implicitly accounts for the statistical relationships between hypothesis tests of different coefficients. We prove that knockoffs controls the $k$-FWER exactly in finite samples and show in simulations that it has higher power than alternative procedures over a range of linear regression problems. We also discuss extensions to controlling other Type I error rates, such as the false exceedance rate, and apply the procedure to identify candidate mutations conferring drug resistance in HIV.
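For readers unfamiliar with the two error rates named above, their standard definitions can be sketched as follows. The abstract does not fix any notation, so the symbols here are assumptions introduced for illustration: $V$ is the number of rejected true null coefficients, $R$ the total number of rejections, $\alpha$ the target level, and $\gamma$ the exceedance threshold.

```latex
% Assumed notation (not from the abstract): V = false rejections,
% R = total rejections, alpha = target level, gamma = exceedance threshold.
% Requires amsmath for \text.
\[
  k\text{-FWER} \;=\; \Pr(V \ge k),
  \qquad \text{controlled at level } \alpha \text{ when } \Pr(V \ge k) \le \alpha .
\]
% False exceedance rate: probability that the false discovery proportion exceeds gamma.
\[
  \mathrm{FDX}_{\gamma} \;=\; \Pr\!\left( \frac{V}{\max(R,\,1)} > \gamma \right).
\]
```

Setting $k = 1$ recovers the usual familywise error rate, $\Pr(V \ge 1)$.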