Workflows for microarray data processing in the Kepler environment

Thomas Stropp; Timothy McPhillips; Bertram Ludäscher; Mark Bieda

Abstract

Background: Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services.

Results: We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R/BioConductor scripting approaches to pipeline design. Finally, we suggest that microarray data processing task workflows may provide a basis for future example-based comparison of different workflow systems.

Conclusions: We provide a set of tools and complete workflows for microarray data analysis in the Kepler environment, which has the advantages of offering graphical, clear display of conceptual steps and parameters and the ability to easily integrate other resources such as remote data and web services.

Bibliographic record

Source: BMC Bioinformatics, 2012, Issue 1
Authors: Thomas Stropp; Timothy McPhillips; Bertram Ludäscher; Mark Bieda
Original format: PDF
Chinese Library Classification: Biological sciences
