
FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data



Abstract

Background: Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. To this end, existing clustering approaches, mainly developed in computer science, have been adapted to microarray data analysis. However, previous studies revealed that microarray datasets have very diverse structures, some of which may not be correctly captured by current clustering methods. We therefore approached the problem from a new starting point, and developed a clustering algorithm designed to capture dataset-specific structures at the beginning of the process.

Results: The clustering algorithm is named Fuzzy clustering by Local Approximation of MEmbership (FLAME). Distinctive elements of FLAME are: (i) definition of the neighborhood of each object (gene or sample) and identification of objects with "archetypal" features, named Cluster Supporting Objects, around which to construct the clusters; (ii) assignment to each object of a fuzzy membership vector approximated from the memberships of its neighboring objects, by an iterative converging process in which membership spreads from the Cluster Supporting Objects through their neighbors. Comparative analysis with K-means, hierarchical, fuzzy C-means and fuzzy self-organizing maps (SOM) showed that data partitions generated by FLAME are not superimposable on those of other methods and that, although different types of datasets are better partitioned by different algorithms, FLAME displays the best overall performance. FLAME is implemented, together with all the above-mentioned algorithms, in Gene Expression Data Analysis Studio (GEDAS), a C++ program with a graphical interface for Linux and Windows that is capable of handling very large datasets and is freely available under the GNU General Public License.

Conclusion: The FLAME algorithm has intrinsic advantages, such as the ability to capture non-linear relationships and non-globular clusters, the automated definition of the number of clusters, and the identification of cluster outliers, i.e. genes that are not assigned to any cluster. As a result, clusters are more internally homogeneous and more distinct from each other, and provide better partitioning of biological functions. The clustering algorithm can be easily extended to applications other than gene expression analysis.
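
The two elements described under Results (neighborhood-based selection of Cluster Supporting Objects, then iterative spreading of fuzzy membership) can be illustrated with a minimal Python sketch. This is not the GEDAS implementation. The abstract does not specify the distance measure, the density definition, the neighbor weighting, or the stopping rule, so the Euclidean distance, the inverse-mean-distance density, the unweighted neighbor average, and the fixed iteration count below are all assumptions. Outlier detection is omitted for brevity, and the function name `flame_sketch` and its parameters are hypothetical.

```python
import math


def flame_sketch(points, k=3, iterations=100):
    """Simplified FLAME-style fuzzy clustering (illustrative assumptions only)."""
    n = len(points)

    # Step 1: find the k nearest neighbours of each object (Euclidean, assumed).
    neighbors, density = [], []
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        neighbors.append([j for _, j in nearest])
        # Assumed density: inverse of the mean distance to the k neighbours.
        mean_dist = sum(d for d, _ in nearest) / k
        density.append(1.0 / (mean_dist + 1e-12))

    # Step 2: a Cluster Supporting Object (CSO) is denser than all its neighbours.
    csos = [
        i for i in range(n)
        if all(density[i] > density[j] for j in neighbors[i])
    ]
    if not csos:
        raise ValueError("no Cluster Supporting Objects found; try another k")
    c = len(csos)  # the number of clusters follows from the data

    # Step 3: CSOs get fixed full membership in their own cluster;
    # every other object starts with uniform membership.
    membership = [[1.0 / c] * c for _ in range(n)]
    for col, i in enumerate(csos):
        membership[i] = [1.0 if m == col else 0.0 for m in range(c)]
    fixed = set(csos)

    # Step 4: repeatedly replace each non-CSO membership vector with the
    # (unweighted, assumed) average of its neighbours' vectors, so that
    # membership spreads outward from the CSOs.
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in fixed:
                updated.append(membership[i])
            else:
                updated.append([
                    sum(membership[j][m] for j in neighbors[i]) / k
                    for m in range(c)
                ])
        membership = updated

    return csos, membership


if __name__ == "__main__":
    # Two well-separated toy groups, each a unit square with a centre point.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
            (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    csos, membership = flame_sketch(data, k=3)
    labels = [row.index(max(row)) for row in membership]
    print("CSO indices:", csos)
    print("hard labels:", labels)
```

In this sketch the number of clusters is not passed in; it equals the number of Cluster Supporting Objects found in step 2, which mirrors the automated definition of the number of clusters mentioned in the Conclusion. A hard partition, where needed, can be read off by taking the largest entry of each membership vector.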