Conference: IEEE/NIH Life Science Systems and Applications Workshop
High-throughput computation of pairwise sequence similarities for multiple genome comparisons using ScalaBLAST

Abstract

Genome sequence comparisons across exponentially growing data sets form the foundation for the comparative analysis tools provided by community biological data resources such as the Integrated Microbial Genome (IMG) system at the Joint Genome Institute (JGI). For a genome sequencing center to provide multiple-genome comparison capabilities, it must keep pace with an exponentially growing collection of sequence data, both from its own genomes and from public genomes. We present an example of how ScalaBLAST, a high-throughput sequence analysis program, harnesses increasingly critical high-performance computing to perform sequence analysis. It enables, for example, all vs. all BLAST runs across 2 million protein sequences within a day using thousands of processors, whereas conventional comparison methods would take years to complete.
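
To give a sense of the scale behind the all vs. all figure, the following minimal sketch counts the ordered query/subject pairs for 2 million sequences and splits the query set into near-equal contiguous blocks, one per worker. The static block partitioning, the function names `count_all_vs_all` and `partition_queries`, and the worker count of 1,000 are illustrative assumptions; the abstract does not describe how ScalaBLAST actually distributes its work.

```python
from typing import List


def count_all_vs_all(n_sequences: int) -> int:
    """Number of ordered query/subject pairs in an all vs. all run."""
    return n_sequences * n_sequences


def partition_queries(n_queries: int, n_workers: int) -> List[range]:
    """Split query indices into contiguous, near-equal blocks, one per worker.

    Hypothetical static partitioning for illustration only; not taken
    from ScalaBLAST.
    """
    base, extra = divmod(n_queries, n_workers)
    blocks = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers each take one additional query.
        size = base + (1 if worker < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


if __name__ == "__main__":
    n = 2_000_000    # sequence count quoted in the abstract
    workers = 1_000  # assumed stand-in for "thousands of processors"
    blocks = partition_queries(n, workers)
    print("ordered pairs:", count_all_vs_all(n))
    print("queries per worker:", len(blocks[0]))
```

Each worker would compare its block of queries against the full sequence set, so the per-worker cost shrinks roughly in proportion to the number of workers, provided the sequences are of comparable length.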
