Conference: Pacific Symposium on Biocomputing

IMPUTATION-BASED ASSESSMENT OF NEXT GENERATION RARE EXOME VARIANT ARRAYS

Abstract

A striking finding from recent large-scale sequencing efforts is that the vast majority of variants in the human genome are rare and found within single populations or lineages. These observations hold important implications for the design of the next round of disease variant discovery efforts: if genetic variants that influence disease risk follow the same trend, then we expect to see population-specific disease associations that require large sample sizes for detection. To address this challenge, and due to the still prohibitive cost of sequencing large cohorts, researchers have developed a new generation of low-cost genotyping arrays that assay rare variation previously identified from large exome sequencing studies. Genotyping approaches rely not only on directly observing variants, but also on phasing and imputation methods that use publicly available reference panels to infer unobserved variants in a study cohort. Rare variant exome arrays are intentionally enriched for variants likely to be disease causing, and here we assay the ability of the first commercially available rare exome variant array (the Illumina Infinium HumanExome BeadChip) to also tag other potentially damaging variants not molecularly assayed. Using full sequence data from chromosome 22 from the phase I 1000 Genomes Project, we evaluate three methods for imputation (BEAGLE, MaCH-Admix, and SHAPEIT2/IMPUTE2) with the rare exome variant array under varied study panel sizes, reference panel sizes, and LD structures via population differences. We find that imputation is more accurate across both the genome and exome for common variant arrays than for the next generation array at all allele frequencies, including rare alleles. We also find that imputation is the least accurate in African populations, and accuracy is substantially improved for rare variants when the same population is included in the reference panel. Depending on the goals of GWAS researchers, our results will aid budget decisions by helping determine whether money is best spent sequencing the genomes of smaller sample sizes, genotyping larger sample sizes with rare and/or common variant arrays and imputing SNPs, or some combination of the two.
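The abstract does not say how imputation accuracy was scored. One widely used measure, assumed here purely for illustration, is the squared Pearson correlation between the sequenced genotypes (0, 1, or 2 copies of the alternate allele) and the imputed allele dosages, averaged within minor allele frequency bins. The function names and bin edges below are hypothetical and are not taken from the paper or from BEAGLE, MaCH-Admix, or IMPUTE2.

```python
import numpy as np


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation for one variant across samples."""
    t = np.asarray(true_genotypes, dtype=float)
    d = np.asarray(imputed_dosages, dtype=float)
    # Correlation is undefined when either vector has no variation.
    if t.std() == 0 or d.std() == 0:
        return float("nan")
    r = np.corrcoef(t, d)[0, 1]
    return float(r * r)


def minor_allele_frequency(genotypes):
    """Minor allele frequency from 0/1/2 genotype counts."""
    g = np.asarray(genotypes, dtype=float)
    p = g.mean() / 2.0  # alternate allele frequency
    return float(min(p, 1.0 - p))


def mean_r2_by_maf_bin(truth, imputed, bin_edges=(0.0, 0.005, 0.05, 0.5)):
    """Average per-variant r^2 within (low, high] frequency bins.

    truth, imputed: arrays of shape (n_variants, n_samples).
    bin_edges: illustrative cut points, an assumption of this sketch.
    """
    truth = np.asarray(truth, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    results = {}
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        scores = []
        for t_row, d_row in zip(truth, imputed):
            maf = minor_allele_frequency(t_row)
            if low < maf <= high:
                r2 = dosage_r2(t_row, d_row)
                if not np.isnan(r2):
                    scores.append(r2)
        results[(low, high)] = float(np.mean(scores)) if scores else float("nan")
    return results
```

In an evaluation of the kind the abstract describes, `truth` would hold the sequenced genotypes at sites not present on the array and `imputed` the dosages a given method produced for those same sites, so the per-bin averages could be compared across arrays, methods, and reference panels.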
