The American Journal of Human Genetics

Accurate and Fast Multiple-Testing Correction in eQTL Studies



Abstract

In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. A permutation test is widely used to perform this multiple-testing correction. Because of the growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck. In this paper, we propose an efficient approach for correcting for multiple testing and assessing eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We have shown that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset.
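The general idea of replacing permutations with draws from a multivariate normal distribution can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the small-sample correction mentioned in the abstract, uses plain Monte Carlo sampling, and all function and parameter names (`cholesky`, `mvn_gene_level_p`, `ld_corr`, `n_draws`) are hypothetical. It assumes that, under the null, the association z-scores of the cis variants are jointly normal with a correlation matrix given by their linkage disequilibrium, and that this matrix is positive definite.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Return the lower-triangular Cholesky factor of a symmetric
    positive-definite matrix given as a list of lists."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def mvn_gene_level_p(min_p, ld_corr, n_draws=20000, seed=0):
    """Estimate a gene-level p value for the minimum per-variant p value.

    min_p   : smallest two-sided p value among the cis variants (0 < min_p <= 1)
    ld_corr : assumed LD correlation matrix of the variants (positive definite)

    Assumption: null z-scores follow N(0, ld_corr). The gene-level p value is
    the probability that at least one |z| is as extreme as the observed one.
    """
    rng = random.Random(seed)
    m = len(ld_corr)
    lower = cholesky(ld_corr)
    # Two-sided z threshold corresponding to the observed minimum p value.
    z_threshold = NormalDist().inv_cdf(1.0 - min_p / 2.0)
    exceed = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # z = L e has covariance L L^T = ld_corr.
        for i in range(m):
            z_i = sum(lower[i][k] * e[k] for k in range(i + 1))
            if abs(z_i) >= z_threshold:
                exceed += 1
                break
    # Add-one estimator keeps the estimate strictly positive.
    return (exceed + 1) / (n_draws + 1)


if __name__ == "__main__":
    m = 10
    independent = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    linked = [[1.0 if i == j else 0.9 for j in range(m)] for i in range(m)]
    print(mvn_gene_level_p(0.01, independent))  # close to 1 - 0.99**10
    print(mvn_gene_level_p(0.01, linked))       # smaller: fewer effective tests
```

The cost of this sketch depends on the number of variants and the number of draws, not on the number of samples in the study, which is the property the abstract highlights. With strongly linked variants the corrected p value is smaller than with independent variants, because correlated tests behave like fewer effective tests.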
