Bioinformatics Workflows With NoSQL Database in Cloud Computing


Abstract

Scientific workflows can be understood as arrangements of managed activities executed by different processing entities. Applying workflows is a regular approach in Bioinformatics for solving problems in Molecular Biology, notably those related to sequence analyses. Due to the nature of the raw data and the in silico environment of Molecular Biology experiments, 2 practical and closely related problems have been studied apart from the research subject itself: reproducibility and computational environment. When aiming to enhance the reproducibility of Bioinformatics experiments, various aspects should be considered. The reproducibility requirements comprise the data provenance, which enables the acquisition of knowledge about the trajectory of data over a defined workflow, the settings of the programs, and the entire computational environment. Cloud computing is a booming alternative that can provide this computational environment, hiding technical details and delivering a more affordable, accessible, and configurable on-demand environment for researchers. Considering this specific scenario, we proposed a solution to improve the reproducibility of Bioinformatics workflows in a cloud computing environment using both Infrastructure as a Service (IaaS) and Not only SQL (NoSQL) database systems. To meet this goal, we built 3 typical Bioinformatics workflows and ran them on 1 private and 2 public clouds, using different types of NoSQL database systems to persist the provenance data according to the Provenance Data Model (PROV-DM). We present here the results and a guide for the deployment of a cloud environment for Bioinformatics, exploring the characteristics of various NoSQL database systems to persist provenance data.

Bibliographic record

Journal: Evolutionary Bioinformatics Online
Authors: Polyane Wercelens; Waldeyr da Silva; Fernanda Hondo; Klayton Castro; Maria Emília Walter; Aletéia Araújo; Sergio Lifschitz; Maristela Holanda
Year (volume), issue: 2019 (15), -1
Year: 2019
Pages: -1
Total pages: 11
Original format: PDF
Chinese Library Classification: Theory of evolution; phylogeny of biological systems
Keywords: Bioinformatics workflows; reproducibility; data provenance; cloud computing; NoSQL

Date indexed: 2022-08-21 11:38:18

Similar literature

1. Bioinformatics Workflows With NoSQL Database in Cloud Computing [J]. Polyane Wercelens, Waldeyr da Silva, Fernanda Hondo. Evolutionary Bioinformatics. 2019, issue a.
2. Implementation of Secondary Index on Cloud Computing NoSQL Database in Big Data Environment [J]. Bao Rong Chang, Hsiu-Fen Tsai, Chia-Yen Chen. Scientific Programming. 2015, issue 4.
3. Implementation of Secondary Index on Cloud Computing NoSQL Database in Big Data Environment [J]. Chang Bao Rong, Tsai Hsiu-Fen, Chen Chia-Yen. Scientific Programming. 2015.
4. Data provenance management for bioinformatics workflows using NoSQL database systems in a cloud computing environment [C]. Fernanda Hondo, Polyane Wercelens, Waldeyr da Silva. IEEE International Conference on Bioinformatics and Biomedicine. 2017.
5. Dynamic Multi-objective Workflow Scheduling in Cloud Computing [D]. Ismayılov, Goshgar. 2019.
6. Cloud-Based NoSQL Open Database of Pulmonary Nodules for Computer-Aided Lung Cancer Diagnosis and Reproducible Research [O]. José Raniery Ferreira Junior, Marcelo Costa Oliveira, Paulo Mazzoncini de Azevedo-Marques. 2016.
7. NoSQL Databases: New Millennium Database for Big Data, Big Users, Cloud Computing and Its Security Challenges [O]. Asadulla Khan Zaki. 2015.
