IEEE International Parallel and Distributed Processing Symposium

Symposium Evening Tutorial: High-performance Computing Methods for Computational Genomics

Abstract

As biomolecular sequence data continue to be amassed at unprecedented rates, designing effective computational methods and capabilities that can derive biologically significant information from these data has become both increasingly challenging and imperative. In this tutorial, the audience will first be introduced to the different types of biomolecular sequence data and the wealth of information they encode. Following this technical grounding, high-performance computing approaches developed to address some of the most computationally challenging problems in genomics will be described. The contents will be presented in three parts: (i) In the first part, we will describe methods designed to query a sequence against a large sequence database. Two popular parallel approaches that implement the NCBI BLAST suite of programs, mpiBLAST and ScalaBLAST, will be described. (ii) Next, we will describe PaCE, a parallel DNA sequence clustering algorithm. As direct applications, we will discuss the clustering of large-scale Expressed Sequence Tag data and the assembly of complex genomes. (iii) Finally, we will describe GRAPPA, a high-performance software suite developed for phylogenetic reconstruction of a collection of genomes or genes. Throughout the tutorial, the emphasis will be on both scalability and effectiveness in exploiting large-scale, state-of-the-art supercomputing technologies. The intended audience is academic and industry researchers, educators, and/or commercial application developers with a computational background. No background in biology is assumed.
