Journal: Nature Biotechnology

Hybrid error correction and de novo assembly of single-molecule sequencing reads

Abstract

Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.
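The core idea of the abstract, using accurate short reads to vote on each base of an error-prone long read, can be illustrated with a toy sketch. This is not the authors' algorithm: the function name `correct_long_read`, the `min_depth` parameter, and the input format are hypothetical, the short reads are assumed to be already placed on the long read at known offsets, and only substitution errors are handled. A real pipeline would also need an alignment step and would have to deal with insertions and deletions.

```python
from collections import Counter


def correct_long_read(long_read, placed_short_reads, min_depth=2):
    """Toy consensus correction of one long read (substitutions only).

    placed_short_reads is a list of (offset, sequence) pairs, where offset
    is the assumed 0-based start of the short read on the long read.
    """
    # One vote counter per long-read position.
    votes = [Counter() for _ in long_read]
    for offset, short in placed_short_reads:
        for i, base in enumerate(short):
            pos = offset + i
            if 0 <= pos < len(long_read):
                votes[pos][base] += 1

    corrected = []
    for base, counter in zip(long_read, votes):
        depth = sum(counter.values())
        if depth >= min_depth:
            top, count = counter.most_common(1)[0]
            # Replace the base only when a strict majority of the
            # covering short reads agree on the same base.
            if 2 * count > depth:
                corrected.append(top)
                continue
        # Too little coverage or no clear majority: keep the original base.
        corrected.append(base)
    return "".join(corrected)


# Hypothetical example: the true sequence is ACGTACGTAC, and the long read
# carries substitution errors at positions 3 and 8.
noisy = "ACGAACGTTC"
short_reads = [(0, "ACGTA"), (1, "CGTAC"), (2, "GTACG"),
               (4, "ACGTAC"), (5, "CGTAC")]
print(correct_long_read(noisy, short_reads))  # prints ACGTACGTAC
```

Positions covered by fewer than `min_depth` short reads are left unchanged, which reflects the general trade-off in this kind of correction: bases are only rewritten where the high-fidelity data gives enough evidence.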