BMC Bioinformatics

A formal concept analysis approach to consensus clustering of multi-experiment expression data


Abstract

Background: With the increasing number and complexity of available gene expression datasets, combining data from multiple microarray studies that address a similar biological question is gaining importance. Analysing and integrating multiple datasets is expected to yield more reliable and robust results, since the results are based on a larger number of samples and the effects of individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced across multiple experiments. One approach to combining data from different experiments is to aggregate their clusterings into a consensus or representative clustering solution, which increases confidence in the features common to all the datasets and reveals the important differences among them.

Results: We propose a novel generic consensus clustering technique that applies a Formal Concept Analysis (FCA) approach to the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group, resulting in one clustering solution per group. These solutions are pooled together and further analysed with FCA, which allows valuable insights to be extracted from the data and generates a gene partition over all the experiments. To validate the FCA-enhanced approach, two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from a multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although the two algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological signals.

Conclusions: The proposed FCA-enhanced consensus clustering technique is a general approach to combining clustering algorithms with FCA in order to derive clustering solutions from multiple gene expression matrices. The experimental results presented herein demonstrate that it is a robust data integration technique, able to produce a good-quality clustering solution that is representative of the whole set of expression matrices.
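The pooling and FCA step described in the Results can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each gene is treated as an object and each (group, cluster) pair as an attribute of a formal context, and that the gene partition is obtained by grouping genes with identical attribute sets. The gene names, group names, and the functions `build_context`, `gene_partition`, and `formal_concepts` are hypothetical and chosen only for illustration.

```python
def build_context(solutions):
    """Build a formal context mapping each gene to its set of attributes.

    `solutions` maps a group name to that group's clustering solution,
    given as {gene: cluster_label}. Each attribute is a (group, label) pair.
    (Assumed encoding, not taken from the paper.)
    """
    context = {}
    for group, labels in solutions.items():
        for gene, cluster in labels.items():
            context.setdefault(gene, set()).add((group, cluster))
    return context


def gene_partition(context):
    """Partition genes so that each block shares exactly the same attributes."""
    blocks = {}
    for gene, attrs in context.items():
        blocks.setdefault(frozenset(attrs), set()).add(gene)
    return blocks


def formal_concepts(context):
    """Enumerate all formal concepts as (extent, intent) pairs.

    Intents are all intersections of gene attribute sets, plus the full
    attribute set; the extent is every gene that has the whole intent.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        # Close the current family of intents under intersection with this gene.
        intents |= {intent & attrs for intent in intents}
    return [
        (frozenset(g for g, a in context.items() if intent <= a), intent)
        for intent in intents
    ]


if __name__ == "__main__":
    # Hypothetical consensus clusterings for two groups of experiments.
    solutions = {
        "A": {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
        "B": {"g1": 0, "g2": 0, "g3": 0, "g4": 1},
    }
    context = build_context(solutions)
    for intent, genes in gene_partition(context).items():
        print(sorted(genes), sorted(intent))
    for extent, intent in formal_concepts(context):
        print(sorted(extent), sorted(intent))
```

In this toy encoding, genes that are co-clustered in every group end up in the same block of the partition, while the concept lattice also exposes larger gene sets that agree on only some of the groups.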
