Conference: International Conference on Bioinformatics and Biomedical Engineering

AC-DIAMOND: Accelerating Protein Alignment via Better SIMD Parallelization and Space-Efficient Indexing

Abstract

Speeding up the alignment of DNA reads or assembled contigs against a protein database remains a challenge. The recent tool DIAMOND is significantly faster than BLASTX and RAPSearch while giving a similar degree of sensitivity. Yet for applications like metagenomics, where large amounts of data are involved, DIAMOND still takes a lot of time. This paper introduces an even faster protein alignment tool, called AC-DIAMOND, which attempts to speed up DIAMOND via better SIMD parallelization and more space-efficient indexing of the reference database; the latter allows more queries to be loaded into memory and processed together. Experimental results show that AC-DIAMOND is about 4 times faster than DIAMOND at aligning DNA reads or contigs, while retaining the same sensitivity as DIAMOND. For example, the latest assembly of the Iowa prairie soil metagenomic dataset generates over 9 million contigs, with a total size of about 7 Gbp; when aligning these contigs to the protein database NCBI-nr, DIAMOND takes 4 to 5 days, and AC-DIAMOND takes about 1 day.