YOABS: yet other aligner of biological sequences-an efficient linearly scaling nucleotide aligner

Abstract

Motivation: The explosive growth of short-read sequencing technologies in recent years has led to the rapid development of many new alignment algorithms and programs. Most of them, however, are inefficient or inapplicable for reads ≳200 bp, because they were designed specifically to process short queries with relatively low sequencing error rates. At the same time, the current drive to detect structural variations in assembled genomes more reliably and to facilitate de novo sequencing demands complementing high-throughput short-read platforms with long-read mapping. Algorithms and programs for efficient mapping of longer reads are therefore becoming crucial. Yet the choice of long-read aligners that are effective in terms of both performance and memory is limited to a handful of algorithms based on hash tables (BLAT, SSAHA2) or tries (Burrows-Wheeler Transform-Smith-Waterman (BWT-SW), Burrows-Wheeler Aligner-Smith-Waterman (BWA-SW)).

Results: A new O(n) algorithm that combines the advantages of both hash- and trie-based methods has been designed to effectively align long biological sequences (≳200 bp) against a large sequence database with a small memory footprint (e.g. ≳2 GB for the human genome). The algorithm is accurate and significantly faster than BLAT or BWT-SW, and, like BWT-SW, it can find all local alignments. It is as accurate as SSAHA2 or BWA-SW, but uses more than 3 times less memory and is more than 10 times faster than SSAHA2; at low error rates it is several times faster than BWA-SW while using almost two times less memory.