Journal: DNA Research: an international journal for rapid publication of reports on genes and genomes

MetaVelvet-SL: an extension of the Velvet assembler to a de novo metagenomic assembler utilizing supervised learning

Abstract

The assembly of multiple genomes from mixed sequence reads is a bottleneck in metagenomic analysis. A single-genome assembly program (assembler) cannot resolve metagenome sequences, so assemblers designed specifically for metagenomics have been developed. MetaVelvet is an extension of the single-genome assembler Velvet. When applied to metagenomic sequence reads, it has been shown to generate assemblies with higher N50 scores and higher quality than single-genome assemblers such as Velvet and SOAPdenovo, and it is frequently used in this research community. One important open problem for MetaVelvet is its low accuracy and sensitivity in detecting chimeric nodes in the assembly (de Bruijn) graph, which prevents the generation of longer contigs and scaffolds. We tackled this problem by classifying chimeric nodes with supervised machine learning, which significantly improves the performance of MetaVelvet, and developed a new tool called MetaVelvet-SL. A Support Vector Machine learns the classification model from 94 features extracted from candidate nodes. In extensive experiments, MetaVelvet-SL outperformed the original MetaVelvet and other state-of-the-art metagenomic assemblers (IDBA-UD, Ray Meta and Omega), reconstructing longer, accurate assemblies with higher N50 scores for both simulated data sets and real data sets of human gut microbial sequences.
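
The abstract compares assemblers by N50 score without defining it. As a minimal sketch, assuming the common definition (the largest length L such that contigs of length at least L together cover at least half of the total assembly length), N50 can be computed as below. The function name `n50` and the example lengths are illustrative only and do not come from MetaVelvet-SL or the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Assumes the common definition: sort contigs from longest to shortest,
    accumulate their lengths, and report the length of the contig at which
    the running sum first reaches half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("n50 is undefined for an empty assembly")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    example = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(example))
```

A higher N50 means that more of the assembly sits in long contigs. It says nothing about correctness by itself, which is why the abstract reports accuracy alongside N50.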