Cloud MapReduce for Monte Carlo bootstrap applied to Metabolic Flux Analysis


Abstract

The MapReduce architectural pattern popularized by Google has been used successfully in several scientific applications. Until now, however, MapReduce has rarely been employed in the field of Systems Biology. We investigate whether a MapReduce approach that uses on-demand resources from a Cloud is suitable for simulation tasks in the area of Metabolic Flux Analysis (MFA). An Amazon ElasticMapReduce Cloud implementation of the parallel, parametric Monte Carlo bootstrap in the context of ¹³C-MFA is presented. The seamless integration of the application into a service-oriented, BPEL-based scientific workflow framework is shown. A comparison of a straightforward MapReduce implementation, using the Hadoop streaming interface on various Amazon ElasticMapReduce instance types, with a single-CPU-core computation reveals a speedup of 17 on 64 Amazon cores. I/O operations on many small files within the Reduce step were identified as the limiting step. By exploiting the Hadoop Java API, making use of built-in data types, and tuning problem-specific Hadoop parameters, the I/O issues could be resolved. With the revised implementation, a speedup of up to 48 was achieved on 64 Amazon cores. To investigate the runtime of a realistic ¹³C-MFA analysis, 50,000 Monte Carlo samples with a typical metabolic network model were computed on 20 virtual nodes in 24 h and 23 min at a total cost of $384. Our work demonstrates that scalable Systems Biology applications can be run on Amazon's Cloud MapReduce service.
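
The way a parametric Monte Carlo bootstrap splits into map and reduce steps can be illustrated with a minimal sketch. It is not the authors' implementation: the one-parameter linear model, the values `FITTED_FLUX` and `SIGMA`, and all function names are hypothetical stand-ins for a real ¹³C-MFA fit, and the map calls run in one local process rather than on Hadoop.

```python
import random
import statistics

# Toy stand-in for a metabolic network model: y = flux * x.
# (Assumption: a real 13C-MFA fit is a nonlinear optimization.)
X = [1.0, 2.0, 3.0, 4.0]
FITTED_FLUX = 2.0   # assumed best-fit parameter from the original data
SIGMA = 0.1         # assumed measurement standard deviation


def fit_flux(ys):
    """Least-squares estimate of flux for the model y = flux * x."""
    return sum(x * y for x, y in zip(X, ys)) / sum(x * x for x in X)


def map_sample(seed):
    """Map step: simulate one noisy data set from the fitted model, refit it."""
    rng = random.Random(seed)
    ys = [FITTED_FLUX * x + rng.gauss(0.0, SIGMA) for x in X]
    return fit_flux(ys)


def reduce_samples(estimates):
    """Reduce step: summarize the bootstrap distribution of the estimate."""
    estimates = list(estimates)
    return {
        "n": len(estimates),
        "mean": statistics.fmean(estimates),
        "stdev": statistics.stdev(estimates),
    }


def run_bootstrap(n_samples):
    # Samples are independent, so each map call could run on its own worker.
    return reduce_samples(map(map_sample, range(n_samples)))


def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup that was actually reached."""
    return speedup / cores
```

Calling `run_bootstrap(500)` returns the sample count together with the mean and standard deviation of the refitted parameter. Applied to the figures reported in the abstract, `parallel_efficiency(17, 64)` is about 0.27 for the streaming implementation and `parallel_efficiency(48, 64)` is 0.75 for the revised one.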

Bibliographic record

  • Source
    Future Generation Computer Systems | 2013, Issue 2 | pp. 582-590 | 9 pages
  • Author affiliations

    Institute of Bio- and Geosciences 1: Biotechnology 2, Forschungszentrum Juelich, Wilhelm-Johnen-Strasse, D-52428 Juelich, Germany;

    Department of Mathematics & Computer Science and Center for Synthetic Microbiology, University of Marburg, Hans-Meerwein-Strasse 3, D-35032 Marburg, Germany;

    Department of Mathematics & Computer Science and Center for Synthetic Microbiology, University of Marburg, Hans-Meerwein-Strasse 3, D-35032 Marburg, Germany;

    Institute of Bio- and Geosciences 1: Biotechnology 2, Forschungszentrum Juelich, Wilhelm-Johnen-Strasse, D-52428 Juelich, Germany;

    Institute of Bio- and Geosciences 1: Biotechnology 2, Forschungszentrum Juelich, Wilhelm-Johnen-Strasse, D-52428 Juelich, Germany;

    Institute of Bio- and Geosciences 1: Biotechnology 2, Forschungszentrum Juelich, Wilhelm-Johnen-Strasse, D-52428 Juelich, Germany;

    Department of Mathematics & Computer Science and Center for Synthetic Microbiology, University of Marburg, Hans-Meerwein-Strasse 3, D-35032 Marburg, Germany;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Keywords

    metabolic flux analysis; cloud computing; scientific workflows; hadoop; MapReduce; monte carlo bootstrap;
