Source: Journal of Integrative Bioinformatics (US National Institutes of Health literature collection)

clubber: removing the bioinformatics bottleneck in big data analyses

Abstract

With the advent of modern-day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high-performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.
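
The abstract does not say how clubber balances submissions, so the following is only a minimal sketch of the general idea of spreading parallel jobs across heterogeneous HPC resources. It is not clubber's actual algorithm or API. The names `Cluster` and `balance`, the per-job cost estimates, and the relative `speed` values are all hypothetical. The heuristic used is a simple greedy one: take the longest jobs first and give each to the cluster expected to free up earliest.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Cluster:
    # Hypothetical model of one HPC resource; only finish_time is used for ordering.
    finish_time: float                                   # estimated time until all assigned jobs finish
    name: str = field(compare=False)
    speed: float = field(compare=False)                  # assumed relative throughput (higher is faster)
    jobs: list = field(default_factory=list, compare=False)


def balance(job_costs, cluster_speeds):
    """Greedily assign jobs to clusters.

    job_costs:      dict mapping job id -> estimated cost (arbitrary work units)
    cluster_speeds: dict mapping cluster name -> relative speed
    Returns a dict mapping cluster name -> list of assigned job ids.
    """
    heap = [Cluster(0.0, name, speed) for name, speed in cluster_speeds.items()]
    heapq.heapify(heap)

    # Longest jobs first, so large jobs are placed while clusters are still empty.
    for job_id, cost in sorted(job_costs.items(), key=lambda kv: -kv[1]):
        cluster = heapq.heappop(heap)            # cluster expected to free up first
        cluster.jobs.append(job_id)
        cluster.finish_time += cost / cluster.speed
        heapq.heappush(heap, cluster)

    return {cluster.name: cluster.jobs for cluster in heap}


if __name__ == "__main__":
    # Illustrative numbers only; not taken from the paper.
    jobs = {"sample_01": 40, "sample_02": 25, "sample_03": 25, "sample_04": 10}
    clusters = {"local_cluster": 1.0, "cloud_nodes": 2.0}
    for name, assigned in balance(jobs, clusters).items():
        print(name, assigned)
```

Adding a resource on demand, as the abstract describes, corresponds in this sketch to adding another entry to `cluster_speeds` before calling `balance`.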