Journal: BMC Bioinformatics

Finishing bacterial genome assemblies with Mix

Abstract

Motivation: Among the challenges that hamper reaping the benefits of genome assembly are both unfinished assemblies and the ensuing experimental costs. First, numerous software solutions for de novo genome assembly are available, each with its own advantages and drawbacks, and there are no clear guidelines on how to choose among them. Second, these solutions produce draft assemblies that often require a resource-intensive finishing phase.

Methods: In this paper we address these two aspects by developing Mix, a tool that mixes two or more draft assemblies without relying on a reference genome, with the goal of reducing contig fragmentation and thus speeding up genome finishing. The proposed algorithm builds an extension graph in which vertices represent extremities of contigs and edges represent existing alignments between these extremities. These alignment edges are used for contig extension. The resulting output assembly corresponds to a set of paths in the extension graph that maximizes the cumulative contig length.

Results: We evaluate the performance of Mix on bacterial NGS data from the GAGE-B study and apply it to newly sequenced Mycoplasma genomes. The resulting final assemblies demonstrate a significant improvement in overall assembly quality. In particular, Mix consistently provides better overall quality results even when the choice is guided solely by standard assembly statistics, as is the case for de novo projects.

Availability: Mix is implemented in Python and is available at https://github.com/cbib/MIX. Novel data for our Mycoplasma study are available at http://services.cbib.u-bordeaux2.fr/mix/.
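
To make the extension-graph idea from the Methods description concrete, here is a minimal toy sketch in Python. It is not the Mix implementation: the contig names, lengths, and alignments are invented for illustration, the function names `build_extension_graph` and `best_path` are hypothetical, and the sketch assumes an acyclic graph, ignores contig orientation and overlap lengths, and returns a single heaviest path rather than a set of paths.

```python
# Hypothetical toy data: contig name -> length in bp (not from the paper).
contig_lengths = {"A1": 500, "A2": 300, "B1": 650, "B2": 200}

# Hypothetical alignments: the end of the first contig aligns with the
# start of the second, so the first can be extended by the second.
alignments = [("A1", "B1"), ("B1", "A2"), ("A1", "B2")]


def build_extension_graph(lengths, alignments):
    """Vertices are contig extremities; edges carry a length contribution."""
    graph = {}
    for name, length in lengths.items():
        # Internal edge: traversing a contig contributes its length.
        graph[(name, "start")] = [((name, "end"), length)]
        graph[(name, "end")] = []
    for left, right in alignments:
        # Alignment edge: contributes no length in this simplified model.
        graph[(left, "end")].append(((right, "start"), 0))
    return graph


def best_path(graph):
    """Return (total_length, path) for the heaviest path in an acyclic graph."""
    memo = {}
    in_progress = set()

    def visit(node):
        if node in memo:
            return memo[node]
        if node in in_progress:
            raise ValueError("cycle detected; this sketch assumes a DAG")
        in_progress.add(node)
        best = (0, [node])
        for nxt, weight in graph[node]:
            sub_length, sub_path = visit(nxt)
            if weight + sub_length > best[0]:
                best = (weight + sub_length, [node] + sub_path)
        in_progress.discard(node)
        memo[node] = best
        return best

    return max((visit(node) for node in graph), key=lambda result: result[0])


graph = build_extension_graph(contig_lengths, alignments)
total, path = best_path(graph)
# Contigs traversed, in order, read off from their "start" extremities.
contigs_used = [name for name, side in path if side == "start"]
print(total, contigs_used)
```

On this toy input the heaviest path chains A1, B1, and A2. A realistic treatment would also need to handle reverse-complement alignments, subtract overlapping bases, and select several vertex-disjoint paths.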
