BMC Bioinformatics

Workflows for microarray data processing in the Kepler environment

Abstract

Background: Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services.

Results: We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R/BioConductor scripting approaches to pipeline design. Finally, we suggest that microarray data processing task workflows may provide a basis for future example-based comparison of different workflow systems.

Conclusions: We provide a set of tools and complete workflows for microarray data analysis in the Kepler environment, which has the advantages of offering graphical, clear display of conceptual steps and parameters and the ability to easily integrate other resources such as remote data and web services.
