Journal: Journal of Cheminformatics

The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration

Abstract

Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system used to design and execute scientific workflows and to aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important for supporting data-driven scientific discovery in complex and explorative bioinformatics applications.

Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that clusters high-dimensional biological data and provides a nonlinear, topology-preserving projection for visualizing the input data and their similarities. The core algorithm in the BioDICE plugin is the Fast Learning Self Organizing Map (FLSOM), an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows visual exploration of multidimensional data and identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds.

Conclusions: Its extensibility and the number and variety of its available tools have made Taverna a popular choice for developing scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool that can be adopted for the explorative analysis of biological datasets.
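To make the idea of a topology-preserving 2D map more concrete, the following is a minimal sketch of a plain online Self Organizing Map in Python. It is not the FLSOM variant used by BioDICE, whose details are not given in the abstract, and it is not taken from the plugin's code. The function names (`train_som`, `best_matching_unit`), the linear decay schedules, and the default parameters are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position (row, col) whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a plain online SOM (illustrative only, NOT FLSOM).

    Returns a dict mapping each grid position (row, col) to its weight vector.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed starting neighbourhood radius

    # Initialise every unit with a random vector in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }

    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking radius with a floor
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                # Squared distance on the 2D grid, not in the input space.
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull the unit towards x, more strongly the nearer it is to the BMU.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

After training, calling `best_matching_unit(weights, x)` for each input vector gives its cell on the 2D grid. Similar inputs would typically land on the same or neighbouring cells, which is the property an interactive map of this kind relies on.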
