Journal: Source Code for Biology and Medicine

Combining de novo and reference-guided assembly with scaffold_builder

Abstract

Genome sequencing has become routine, but genome assembly remains a challenge despite the computational advances of the last decade. In particular, the abundance of repeat elements in genomes makes it difficult to assemble them into a single complete sequence. Identical repeats shorter than the average read length can generally be assembled without issue, whereas longer repeats such as ribosomal RNA operons cannot be accurately assembled using existing tools. The application Scaffold_builder was designed to generate scaffolds (super contigs of sequences joined by N-bases) based on similarity to a closely related reference sequence. This approach is independent of mate-pair information and can complement it in genome assembly, e.g. when mate-pairs are not available or have already been exploited. Scaffold_builder was evaluated using simulated pyrosequencing reads of the bacterial genomes Escherichia coli 042, Lactobacillus salivarius UCC118 and Salmonella enterica subsp. enterica serovar Typhi str. P-stx-12. Moreover, we sequenced two genomes from Salmonella enterica serovar Typhimurium LT2 G455 and Salmonella enterica serovar Typhimurium SDT1291 and show that Scaffold_builder decreases the number of contig sequences by 53% while more than doubling their average length. Scaffold_builder is written in Python and is available at http://edwards.sdsu.edu/scaffold_builder. A web-based implementation is also provided to allow users to submit a reference genome and a set of contigs to be scaffolded.
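To make the idea of a scaffold concrete, the following is a minimal Python sketch of joining contigs with N-bases according to where they map on a reference. It is not Scaffold_builder's actual code: the `Placement` record, the `build_scaffold` function, the 0-based half-open coordinates, and the choice to put a single N between overlapping contigs are all assumptions made for illustration. It also assumes the contig-to-reference mapping has already been computed by some aligner.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Hypothetical record: one contig and where it maps on the reference."""
    name: str
    seq: str
    ref_start: int  # 0-based start on the reference (assumed convention)
    ref_end: int    # exclusive end on the reference


def build_scaffold(placements, min_gap=1):
    """Order contigs by reference position and join them with runs of N.

    The gap between two neighbouring contigs is filled with as many N-bases
    as the distance between them on the reference. If they touch or overlap,
    min_gap N-bases are inserted instead (a simplification of this sketch).
    """
    ordered = sorted(placements, key=lambda p: p.ref_start)
    parts = []
    prev_end = None
    for p in ordered:
        if prev_end is not None:
            gap = p.ref_start - prev_end
            parts.append("N" * max(gap, min_gap))
        parts.append(p.seq)
        # Track the furthest reference position covered so far.
        prev_end = p.ref_end if prev_end is None else max(prev_end, p.ref_end)
    return "".join(parts)


if __name__ == "__main__":
    contigs = [
        Placement("contig_2", "GGGG", 10, 14),
        Placement("contig_1", "AAAA", 0, 4),
    ]
    print(build_scaffold(contigs))
```

In this toy example the two contigs are six reference positions apart, so the scaffold contains six N-bases between them. A real tool would also need to handle reverse-complemented contigs, ambiguous mappings, and overlapping contigs more carefully than this sketch does.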