
A quality control algorithm for filtering SNPs in genome-wide association studies



Abstract

Motivation: Quality control (QC) filtering of single nucleotide polymorphisms (SNPs) is an important step in genome-wide association studies to minimize potential false findings. SNP QC commonly uses expert-guided filters based on QC variables [e.g. Hardy–Weinberg equilibrium, missing proportion (MSP) and minor allele frequency (MAF)] to remove SNPs with insufficient genotyping quality. The rationale of the expert filters is sensible and concrete, but their implementation requires arbitrary thresholds and does not jointly consider all QC features.

Results: We propose an algorithm based on principal component analysis and clustering analysis to identify low-quality SNPs. The method minimizes the use of arbitrary cutoff values, allows a collective consideration of the QC features and provides conditional thresholds contingent on other QC variables (e.g. different MSP thresholds for different MAFs). We applied our method to the seven studies from the Wellcome Trust Case Control Consortium and the major depressive disorder study from the Genetic Association Information Network. We measured the performance of our method against the expert filters on the following criteria: (i) percentage of SNPs excluded due to low quality; (ii) inflation factor of the test statistics (λ); (iii) number of false associations found in the filtered dataset; and (iv) number of true associations missed in the filtered dataset. The results suggest that, with the same or fewer SNPs excluded, the proposed algorithm tends to give a similar or lower value of λ and a reduced number of false associations, while retaining all true associations.

Availability: The algorithm is available at

Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.
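To make the baseline concrete, the following is a minimal sketch of a conventional fixed-threshold expert filter and of the inflation factor λ used as an evaluation criterion. It does not reproduce the proposed PCA and clustering algorithm, whose details the abstract does not give. The names `SnpQc`, `passes_expert_filter` and `inflation_factor`, the default cutoffs and the example SNP records are all hypothetical and chosen for illustration only. λ is computed here with the common definition, the median observed 1-df chi-square statistic divided by its expected median (about 0.4549); the abstract does not state which definition the authors used.

```python
from dataclasses import dataclass
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


@dataclass
class SnpQc:
    """Per-SNP QC variables named in the abstract (hypothetical container)."""
    snp_id: str
    hwe_p: float  # Hardy-Weinberg equilibrium test p-value
    msp: float    # missing proportion
    maf: float    # minor allele frequency


def passes_expert_filter(snp, hwe_min_p=1e-4, msp_max=0.05, maf_min=0.01):
    """Fixed-threshold filter; the default cutoffs are illustrative assumptions.

    Each QC variable is checked independently, which is the limitation the
    abstract points out: thresholds are arbitrary and not considered jointly.
    """
    return (
        snp.hwe_p >= hwe_min_p
        and snp.msp <= msp_max
        and snp.maf >= maf_min
    )


def inflation_factor(chi2_stats):
    """Median observed 1-df chi-square statistic over its expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Made-up example records, one failing each criterion.
snps = [
    SnpQc("rs_a", hwe_p=0.50, msp=0.01, maf=0.20),   # passes all three
    SnpQc("rs_b", hwe_p=1e-7, msp=0.01, maf=0.20),   # fails HWE
    SnpQc("rs_c", hwe_p=0.30, msp=0.10, maf=0.20),   # fails MSP
    SnpQc("rs_d", hwe_p=0.30, msp=0.01, maf=0.001),  # fails MAF
]
kept = [s.snp_id for s in snps if passes_expert_filter(s)]
print(kept)
```

A value of λ close to 1 indicates little inflation of the association test statistics, so comparing λ before and after filtering gives one rough view of how much a QC step helps.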
