Journal: Genetic Epidemiology

On Association Analysis of Rare Variants Under Population Substructure: An Approach for the Detection of Subjects That Can Cause Bias in the Analysis-Topt: An Outlier Detection Method



Abstract

For the analysis of rare-variant data in population-based designs, we propose a method to detect study subjects that may create population substructure in the study sample. Our approach is computationally fast and simple, permitting applications to whole-genome sequencing studies. The method does not require the variants to be in linkage equilibrium and can be applied to all the genetic loci that are available in the study. For both rare and common variants, we assess the performance of our approach by applying it to the 1000 Genomes Project data and in simulation studies. The results are compared to the commonly used outlier detection algorithm based on principal component analysis (PCA). The statistical power of the two approaches to detect outliers is comparable in most of the scenarios, but the power of PCA is lower than that of the novel approach in the presence of linkage disequilibrium and for subpopulations that are genetically similar. The data analysis and the simulation studies suggest that the number of false-positive results differs between the two approaches: our approach maintains the type I error rate, while the outlier detection approach based on PCA does not. Taking into account also the minimal computational requirements of our approach and its ability to incorporate all the marker information, the proposed method will have important applications in sequencing studies and genome-wide association studies.
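The abstract does not specify the PCA-based comparison algorithm in detail. As a rough illustration only, the sketch below assumes one common variant of such a procedure: standardize the genotype matrix, project subjects onto the top principal components, and flag any subject lying more than a fixed number of standard deviations from the mean on any of those components. The function name `pca_outlier_flags`, the parameters `n_components` and `n_sd`, their default values, and the simulated data are all assumptions made for this example; none of them come from the paper, and this is not the proposed method.

```python
import numpy as np


def pca_outlier_flags(genotypes, n_components=2, n_sd=6.0):
    """Flag subjects far from the sample mean on any of the top principal components.

    genotypes: (subjects x variants) array of allele counts in {0, 1, 2}.
    n_components and n_sd are illustrative choices, not values from the paper.
    """
    g = np.asarray(genotypes, dtype=float)

    # Estimate allele frequencies and drop variants with no variation.
    freq = g.mean(axis=0) / 2.0
    polymorphic = (freq > 0.0) & (freq < 1.0)
    g, freq = g[:, polymorphic], freq[polymorphic]

    # Center and scale each variant by its binomial standard deviation.
    z = (g - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))

    # Principal component scores of the subjects via SVD.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]

    # A subject is flagged if it exceeds n_sd standard deviations on any PC.
    centered = scores - scores.mean(axis=0)
    return np.any(np.abs(centered) > n_sd * scores.std(axis=0), axis=1)


# Simulated example (assumed data): 200 subjects from one population
# plus a single subject drawn with very different allele frequencies.
rng = np.random.default_rng(0)
main = rng.binomial(2, 0.2, size=(200, 2000))
odd = rng.binomial(2, 0.8, size=(1, 2000))
flags = pca_outlier_flags(np.vstack([main, odd]))
print("flagged subjects:", np.flatnonzero(flags))
```

A sketch like this treats all variants as independent; the abstract's point is that PCA-based detection loses power under linkage disequilibrium and for genetically similar subpopulations, which this toy simulation does not attempt to reproduce.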
