Journal: GigaScience

Deep whole-genome sequencing of 90 Han Chinese genomes



Abstract

Next-generation sequencing provides high-resolution insight into human genetic information. However, previous studies have focused primarily on low-coverage data because of the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call accurately from low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, so important causal variants are sometimes missed. For Han Chinese populations, the majority of variants have been discovered from low-coverage data from the 1000 Genomes Project. However, high-coverage whole-genome sequencing data are limited for any population, and a large number of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at high depth (~80×) of 90 unrelated individuals of Chinese ancestry, drawn from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency
