BMC Medical Genomics

Estimates of array and pool-construction variance for planning efficient DNA-pooling genome wide association studies

Abstract

Background: Until recently, genome-wide association studies (GWAS) have been restricted to research groups with the budget necessary to genotype hundreds, if not thousands, of samples. Replacing individual genotyping with genotyping of DNA pools in Phase I of a GWAS has proven successful and has dramatically altered the financial feasibility of this approach. When conducting a pool-based GWAS, how well SNP allele frequency is estimated from a DNA pool will influence a study's power to detect associations. Here we address how to control the variance in allele frequency estimation when DNAs are pooled, and how to plan and conduct the most efficient, well-powered pool-based GWAS.

Methods: By examining the variation in allele frequency estimation on SNP arrays between and within DNA pools, we determine how array variance [var(e_array)] and pool-construction variance [var(e_construction)] contribute to the total variance of allele frequency estimation. This information is useful in deciding whether replicate arrays or replicate pools are more useful in reducing variance. Our analysis is based on 27 DNA pools ranging in size from 74 to 446 individual samples, genotyped on a collective total of 128 Illumina beadarrays: 24 1M-Single, 32 1M-Duo, and 72 660-Quad.

Results: For all three Illumina SNP array types our estimates of var(e_array) were similar, between 3 × 10^-4 and 4 × 10^-4 for normalized data. Var(e_construction) accounted for between 20% and 40% of pooling variance across 27 pools in normalized data.

Conclusions: We conclude that, relative to var(e_array), var(e_construction) is of less importance in reducing the variance in allele frequency estimation from DNA pools; however, our data suggest that on average it may be more important than previously thought. We have prepared a simple online tool, PoolingPlanner (available at http://www.kchew.ca/PoolingPlanner/), which calculates the effective sample size (ESS) of a DNA pool given a range of replicate array values. ESS can be used in a power calculator to perform pool-adjusted calculations. This allows one to quickly calculate the loss of power associated with a pooling experiment and to make an informed decision on whether a pool-based GWAS is worth pursuing.
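
To make the relationship between replicate arrays and ESS concrete, the following is a minimal sketch under stated assumptions, not PoolingPlanner's actual implementation. It assumes a simple additive model in which the pooling variance for one pool measured on m replicate arrays is var(e_construction) + var(e_array)/m, and in which ESS is the number of individually genotyped samples whose binomial sampling variance, p(1-p)/(2N), would equal the pool's total variance. The function names, this model, and the example value for var(e_construction) are all hypothetical; only the 3-4 × 10^-4 range for var(e_array) comes from the abstract.

```python
def pooling_variance(var_array, var_construction, n_arrays):
    """Pooling variance for one pool on n_arrays replicate arrays.

    Assumed model (not taken from the abstract): construction error is
    incurred once per pool, array error is averaged over replicates.
    """
    if n_arrays < 1:
        raise ValueError("n_arrays must be at least 1")
    return var_construction + var_array / n_arrays


def effective_sample_size(n_individuals, allele_freq, var_array,
                          var_construction, n_arrays):
    """Individually genotyped sample size with the same total variance.

    Assumed model: total variance = binomial sampling variance over
    2 * n_individuals chromosomes plus the pooling variance.
    """
    if not 0.0 < allele_freq < 1.0:
        raise ValueError("allele_freq must be strictly between 0 and 1")
    pq = allele_freq * (1.0 - allele_freq)
    sampling_var = pq / (2.0 * n_individuals)
    total_var = sampling_var + pooling_variance(
        var_array, var_construction, n_arrays)
    return pq / (2.0 * total_var)


if __name__ == "__main__":
    # var_array is the midpoint of the range reported in the abstract;
    # var_construction is a hypothetical value chosen for illustration.
    for m in (1, 2, 4, 8):
        ess = effective_sample_size(
            n_individuals=200, allele_freq=0.5,
            var_array=3.5e-4, var_construction=1.0e-4, n_arrays=m)
        print(f"{m} replicate array(s): ESS ~ {ess:.1f}")
```

Under this assumed model, adding replicate arrays shrinks only the array term, so ESS rises with m but levels off at a ceiling set by var(e_construction); pushing past that ceiling would require replicate pools rather than more arrays.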
