Journal: Genome Research

Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding


Abstract

We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding ∼18× haploid coverage of aligned sequence and close to 300× clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.
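The abstract names two-base encoding and its error correction without spelling out the mechanism. The sketch below illustrates the commonly described color-space scheme, in which each color is the XOR of the 2-bit codes of two adjacent bases. That this matches the assay used in the paper is an assumption, and every function name here is hypothetical. Under this scheme a true single-base change alters two adjacent colors by the same amount, whereas an isolated color mismatch is more likely a measurement error.

```python
# Minimal sketch of two-base ("color space") encoding. The 2-bit base codes
# below are the conventional mapping; all names are illustrative only.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_colors(seq):
    """Return one color (0-3) per adjacent base pair: XOR of the two base codes."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and a list of colors."""
    code = BASE_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color
        bases.append(CODE_BASE[code])
    return "".join(bases)


def classify(ref_seq, read_colors):
    """Classify a read against a reference of the same length.

    Returns "match", "snp" (two adjacent color mismatches with the same XOR
    difference, consistent with one changed base), or "error-or-other".
    """
    ref_colors = encode_colors(ref_seq)
    diffs = [(i, r ^ c) for i, (r, c) in enumerate(zip(ref_colors, read_colors)) if r != c]
    if not diffs:
        return "match"
    if len(diffs) == 2:
        (i, d1), (j, d2) = diffs
        if j == i + 1 and d1 == d2:
            return "snp"
    return "error-or-other"


if __name__ == "__main__":
    reference = "ATGGC"
    sample = "ATCGC"  # differs from the reference at one base
    print(encode_colors(reference))
    print(classify(reference, encode_colors(sample)))
```

Because a real variant has to be supported by two consistent color changes, a caller can discard most single-color discrepancies before counting evidence for an allele, which is one plausible reading of how few reads per allele can suffice.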
