Concevoir et partager des workflows d’analyse de données. Application aux traitements intensifs en bioinformatique

Machine translation: Designing and sharing data analysis workflows, applied to intensive bioinformatics processing

Abstract

Design and share data analysis workflows. Application to bioinformatics intensive treatments

As part of an Open Science initiative, we are particularly interested in scientific Workflow Management Systems (WfMS) and their applications to intensive data analysis in bioinformatics. We start from the assumption that WfMS can evolve into efficient hubs able to speed up the development and dissemination of innovative analysis methods. These software platforms could rally and unite, around a disciplinary theme, not only the current stakeholders, who are service consumers, but also the service producers. We therefore consider that these environments must be both adapted to the practices of the scientists who design methods and enhanced with increased productivity during design and treatment. These constraints lead us to study the rapid capture of workflows, the simplified integration of technical tasks such as parallelisation, and the customization of deployment. First, we define an expressive graphical workflow language, adapted to the quick capture of workflows. This language is interpreted by a workflow engine based on a new model of computation, which achieves high performance through the use of multiple levels of parallelism. Then, we present a Model-Driven design approach that facilitates the generation of data parallelism and the production of suitable implementations for different execution contexts. We describe in particular the integration of a components and platforms meta-model used to automate the configuration of workflows’ dependencies. Finally, in the case of the Container as a Service (CaaS) cloud model, we develop a workflow specification that is intrinsically re-executable and readily disseminatable. The adoption of this kind of model could lead to an acceleration of exchanges and a better availability of data analysis workflows.
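The abstract mentions a workflow engine that combines multiple levels of parallelism. The following is only a minimal illustrative sketch of that general idea, not the thesis's engine, language, or model of computation: independent steps of a small dependency graph run concurrently (task level), and each step may map a function over its input items (data level). All names (`run_workflow`, `parallel_map`, the toy `reads` input and the step names) are hypothetical, and threads are used for simplicity, so CPU-bound work in CPython would need processes instead.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_map(fn, items, workers=4):
    """Data-level parallelism: apply fn to every item, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))


def run_workflow(steps, deps, inputs, workers=4):
    """Task-level parallelism: run each step once its dependencies are done.

    steps:  step name -> function taking a dict of results available so far
    deps:   step name -> set of step names it depends on (steps only)
    inputs: initial named values, visible to every step
    """
    results = dict(inputs)
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in steps})
    sorter.prepare()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Every step returned here has all of its dependencies satisfied,
            # so the whole batch can be submitted at once.
            batch = {
                name: pool.submit(steps[name], dict(results))
                for name in sorter.get_ready()
            }
            for name, future in batch.items():
                results[name] = future.result()
                sorter.done(name)
    return results


def gc_content(seq):
    """Fraction of G and C bases in a sequence (toy analysis step)."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical three-step workflow: "gc" and "lengths" are independent and
# run concurrently; "report" waits for both.
steps = {
    "gc": lambda r: parallel_map(gc_content, r["reads"]),
    "lengths": lambda r: parallel_map(len, r["reads"]),
    "report": lambda r: list(zip(r["lengths"], r["gc"])),
}
deps = {"report": {"gc", "lengths"}}

result = run_workflow(steps, deps, {"reads": ["GGCC", "ATAT", "ATGC"]})
print(result["report"])
```

Keeping the dependency graph separate from the step functions means the same step definitions could be rescheduled by a different executor without being rewritten.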

Bibliographic record

  • Author

    Moreews Francois;

  • Author affiliation
  • Year 2015
  • Total pages
  • Original format PDF
  • Text language fr
  • Chinese Library Classification
