Journal of Integrative Bioinformatics

Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows

Abstract

Biological databases and computational biology tools are provided by research groups around the world and made accessible on the Web. Combining these resources is common practice in bioinformatics, but integrating heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has commonly been addressed pragmatically, by tedious and error-prone scripting. Recently, however, a more reliable technique, Web Services, has been identified and proposed as the platform that would tie bioinformatics resources together. Over the last decade, Web Services have spread widely in bioinformatics and earned the title of recommended technology. In the era of high-throughput experimentation, however, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated this data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data partitioning lowers the resource demands of services and increases their throughput, which in turn makes it possible to run in silico experiments at genome scale using standard SOAP Web Services and workflows. As a proof of principle, we annotated an RNA-seq dataset using a plain BPEL workflow engine.
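
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of data partitioning, not the authors' method. It assumes that a large input can be split into fixed-size partitions and that each partition is sent to the service in its own request, so that neither side has to hold the whole dataset in memory. The names `partition`, `run_partitioned`, and `fake_annotation_service` are hypothetical; the last one merely stands in for a single SOAP request/response round trip.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def partition(records: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists holding at most `size` records each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_partitioned(
    records: Iterable[str],
    service: Callable[[List[str]], List[str]],
    size: int,
) -> Iterator[str]:
    """Call `service` once per partition and yield results as they arrive."""
    for chunk in partition(records, size):
        # Only one partition is in flight at a time (assumption of this sketch).
        yield from service(chunk)


def fake_annotation_service(chunk: List[str]) -> List[str]:
    # Hypothetical stand-in for one SOAP request/response round trip.
    return [f"{seq}:len={len(seq)}" for seq in chunk]


if __name__ == "__main__":
    sequences = ["ACGT", "GG", "T"]
    for result in run_partitioned(sequences, fake_annotation_service, size=2):
        print(result)
```

Because both functions are generators, the caller can consume results incrementally, which loosely mirrors the stream-like pattern the abstract describes. The partition size is a tuning parameter in this sketch; the abstract does not state what sizes were used.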