Journal: Nucleic Acids Research

An ensemble strategy that significantly improves de novo assembly of microbial genomes from metagenomic next-generation sequencing data


Abstract

Next-generation sequencing (NGS) approaches rapidly produce millions to billions of short reads, which allow pathogen detection and discovery in human clinical, animal and environmental samples. A major limitation of sequence homology-based identification for highly divergent microorganisms is the short length of reads generated by most highly parallel sequencing technologies. Short reads require a high level of sequence similarity to annotated genes to confidently predict gene function or homology. Recognition of highly divergent homologues can be improved by reference-free (de novo) assembly of short overlapping sequence reads into larger contigs. We describe an ensemble strategy that integrates the sequential use of various de Bruijn graph and overlap-layout-consensus assemblers with a novel partitioned sub-assembly approach. We also propose new quality metrics that are suitable for evaluating metagenome de novo assembly. We demonstrate that this new ensemble strategy, tested using in silico spike-in, clinical and environmental NGS datasets, achieves significantly better contigs than current approaches.
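
To make the idea of assembling short overlapping reads into a larger contig concrete, the following is a toy sketch of de Bruijn graph assembly in Python. It is not the ensemble pipeline described in the abstract, and it does not reproduce any particular assembler. The function names `build_de_bruijn` and `walk_contig`, the example reads and the choice of k = 4 are all assumptions made for illustration. Real assemblers additionally handle sequencing errors, reverse complements, uneven coverage and branching paths.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:  # each distinct k-mer contributes one edge
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk_contig(graph, start):
    """Extend a contig from `start` for as long as the path is unambiguous."""
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1

    contig = start
    node = start
    visited = {start}
    # Stop at a dead end, a branch (out-degree != 1),
    # a merge (in-degree != 1), or a cycle.
    while len(graph.get(node, [])) == 1:
        nxt = graph[node][0]
        if indegree[nxt] != 1 or nxt in visited:
            break
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig


# Hypothetical error-free reads of length 6 that overlap by 4 bases.
reads = ["ATGGCG", "GGCGTG", "CGTGCA", "TGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = walk_contig(graph, "ATG")
print(contig)
```

In this toy case the four 6-base reads collapse into a single 12-base contig, which is the property that makes divergent homologues easier to recognise: a longer sequence gives homology searches more signal than any individual read.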