Journal: Bioinformatics

Assembling millions of short DNA sequences using SSAKE


Abstract

Novel DNA sequencing technologies with the potential for up to three orders of magnitude more sequence throughput than conventional Sanger sequencing are emerging. The instrument now available from Solexa Ltd produces millions of short DNA sequences of 25 nt each. Due to ubiquitous repeats in large genomes and the inability of short sequences to uniquely and unambiguously characterize them, the short read length limits applicability for de novo sequencing. However, given the sequencing depth and the throughput of this instrument, stringent assembly of highly identical sequences can be achieved. We describe SSAKE, a tool for aggressively assembling millions of short nucleotide sequences by progressively searching through a prefix tree for the longest possible overlap between any two sequences. SSAKE is designed to help leverage the information from short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets.
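
The idea of searching a prefix tree for the longest possible overlap can be illustrated with a minimal Python sketch. It is not SSAKE's actual implementation: the function names (build_trie, reads_with_prefix, extend_seed), the min_overlap parameter, the tie-breaking rule and the toy reads are all assumptions made for illustration. The sketch extends a seed in the 3' direction only and ignores reverse complements, sequencing errors and ambiguity between competing extensions.

```python
def build_trie(reads):
    """Build a prefix tree (nested dicts) over the distinct reads."""
    root = {}
    for read in set(reads):
        node = root
        for base in read:
            node = node.setdefault(base, {})
        node["$"] = read  # "$" marks the end of a stored read
    return root


def reads_with_prefix(trie, prefix):
    """Return every stored read that starts with `prefix`."""
    node = trie
    for base in prefix:
        node = node.get(base)
        if node is None:
            return []
    found, stack = [], [node]
    while stack:  # collect all reads stored below this node
        current = stack.pop()
        for key, child in current.items():
            if key == "$":
                found.append(child)
            else:
                stack.append(child)
    return found


def extend_seed(seed, reads, min_overlap):
    """Greedily extend `seed` in the 3' direction, longest overlap first.

    Assumes `reads` is non-empty and min_overlap >= 1 (hypothetical parameter).
    """
    trie = build_trie(reads)
    unused = set(reads) - {seed}
    longest = max(len(r) for r in reads)
    contig = seed
    extended = True
    while extended:
        extended = False
        # Try the longest contig suffix first, then progressively shorter ones.
        for k in range(min(len(contig), longest - 1), min_overlap - 1, -1):
            hits = sorted(r for r in reads_with_prefix(trie, contig[-k:])
                          if r in unused and len(r) > k)
            if hits:
                read = hits[0]      # arbitrary but deterministic tie-break
                contig += read[k:]  # append the non-overlapping tail
                unused.discard(read)
                extended = True
                break
    return contig


# Toy data: every 8-nt window of a made-up 16-nt sequence.
genome = "ATGGCGTGCAATCCGA"
reads = [genome[i:i + 8] for i in range(len(genome) - 7)]
contig = extend_seed(reads[0], reads, min_overlap=4)
print(contig)
```

On this repeat-free toy input the seed grows one base at a time until no unused read overlaps the contig end by at least min_overlap bases. Real short-read data contain repeats and errors, which is why a stringent stopping rule matters far more in practice than this sketch suggests.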
