Conference: 2012 International Conference on High Performance Computing & Simulation

An energy-aware bioinformatics application for assembling short reads in high performance computing systems


Abstract

Current biomedical technologies are producing massive amounts of data on an unprecedented scale. The increasing complexity and growth rate of biological data have made bioinformatics data processing and analysis a key and computationally intensive task. High performance computing (HPC) has been successfully applied to major bioinformatics applications to reduce the computational burden. However, a naïve approach to developing parallel bioinformatics applications may achieve a high degree of parallelism while unnecessarily expending computational resources and consuming high levels of energy. As the wealth of biological data and the associated computational burden continue to increase, there is a growing need for energy-efficient computational approaches in the bioinformatics domain. To address this issue, we have developed an energy-aware scheduling (EAS) model for running computationally intensive applications that takes both deadline requirements and energy factors into consideration. An example of a computationally demanding process that would benefit from our scheduling model is the assembly of short sequencing reads produced by next-generation sequencing technologies. Next-generation sequencing produces a very large number of short DNA reads from a biological sample. Multiple overlapping fragments must be aligned and merged into long stretches of contiguous sequence before any useful information can be gathered. The assembly problem is extremely difficult due to the complex nature of the underlying genome structure and the inherent biological error present in current sequencing technologies. We apply our EAS model to a newly proposed assembly algorithm called Merge and Traverse, giving us the ability to generate speedup profiles. Our EAS model was also able to dynamically adjust the number of nodes needed to meet given deadlines for different sets of reads.
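The abstract does not spell out the EAS model itself, so the following is only a minimal sketch of the general idea it describes: given a measured speedup profile, pick a node count that meets a deadline while using as little energy as possible. The function name `choose_nodes`, the constant per-node power draw, and the sample profile values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple


def choose_nodes(
    serial_time_s: float,
    speedup: Dict[int, float],
    deadline_s: float,
    node_power_w: float = 200.0,  # assumed constant power draw per node
) -> Optional[Tuple[int, float, float]]:
    """Return (nodes, runtime_s, energy_j) for the feasible configuration
    with the lowest estimated energy, or None if no configuration in the
    profile meets the deadline.

    Energy is estimated as nodes * node_power_w * runtime, which is a
    simplifying assumption and not the paper's energy model.
    """
    best: Optional[Tuple[int, float, float]] = None
    for nodes in sorted(speedup):
        runtime_s = serial_time_s / speedup[nodes]
        if runtime_s > deadline_s:
            continue  # this node count misses the deadline
        energy_j = nodes * node_power_w * runtime_s
        if best is None or energy_j < best[2]:
            best = (nodes, runtime_s, energy_j)
    return best


# Hypothetical speedup profile: node count -> speedup over a single node.
profile = {1: 1.0, 2: 1.9, 4: 3.5, 8: 6.0, 16: 9.0}

if __name__ == "__main__":
    # One-hour serial job with a 20-minute deadline.
    print(choose_nodes(3600.0, profile, deadline_s=1200.0))
```

Because speedup is usually sublinear, adding nodes beyond the smallest feasible count tends to raise total energy under this estimate, which is why a deadline-driven scheduler can save energy by not always using every available node.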
