Journal of Visual Languages & Computing

Gene expression data clustering and visualization based on a binary hierarchical clustering framework

Abstract

Gene expression data analysis has recently emerged as an active area of research. An important tool for unsupervised analysis of gene expression data is cluster analysis. Although many clustering algorithms have been proposed for this task, problems such as estimating the right number of clusters and adapting to different cluster characteristics are still not satisfactorily addressed. In this paper, we propose a binary hierarchical clustering (BHC) algorithm for the clustering of gene expression data. The BHC algorithm involves two major steps: (i) the fuzzy C-means algorithm and the average linkage hierarchical clustering algorithm are used to partition the data into two classes, and (ii) Fisher linear discriminant analysis is applied to the two classes to refine the partition and assess whether it is acceptable. The BHC algorithm recursively partitions the subclasses until no cluster can be partitioned any further. It does not require the number of clusters to be supplied in advance, nor does it make any assumption about the size of each cluster or the class distribution. The BHC algorithm naturally leads to a tree structure representation, in which the clustering results can be visualized easily.
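
The recursive split-and-assess loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it runs fuzzy C-means with C = 2 directly and omits both the average linkage step and the refinement of the partition. The acceptance rule (the Fisher criterion along the discriminant direction compared against a fixed `threshold`), the `min_size` cutoff, and all function names are hypothetical choices made for this example.

```python
def mean(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fcm_two(rows, m=2.0, iters=100):
    """Fuzzy C-means with C = 2; returns hard 0/1 labels (largest membership)."""
    # Deterministic start (an assumption): first point and the point farthest from it.
    c = [rows[0], max(rows, key=lambda r: sqdist(r, rows[0]))]
    u = []
    for _ in range(iters):
        u = []
        for r in rows:
            inv = [(1.0 / (sqdist(r, ci) + 1e-12)) ** (1.0 / (m - 1)) for ci in c]
            s = sum(inv)
            u.append([v / s for v in inv])
        # Each center is the mean of the data weighted by membership ** m.
        c = [[sum(uk[i] ** m * x for uk, x in zip(u, col)) /
              sum(uk[i] ** m for uk in u) for col in zip(*rows)]
             for i in range(2)]
    return [0 if uk[0] >= uk[1] else 1 for uk in u]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(aug[r][i]))
        aug[i], aug[p] = aug[p], aug[i]
        for r in range(n):
            if r != i:
                f = aug[r][i] / aug[i][i]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[i])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fisher_score(rows, labels, ridge=1e-6):
    """Fisher criterion of a two-class split along w = Sw^-1 (m1 - m0)."""
    groups = [[r for r, l in zip(rows, labels) if l == k] for k in (0, 1)]
    if min(len(g) for g in groups) < 2:
        return 0.0
    m0, m1 = mean(groups[0]), mean(groups[1])
    d, dof = len(m0), len(rows) - 2
    # Pooled within-class covariance, with a small ridge for numerical stability.
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for g, mu in zip(groups, (m0, m1)):
        for r in g:
            diff = [x - y for x, y in zip(r, mu)]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += diff[i] * diff[j] / dof
    delta = [b - a for a, b in zip(m0, m1)]
    w = solve(sw, delta)
    # With w = Sw^-1 delta, (w.delta)^2 / (w' Sw w) simplifies to w.delta.
    return sum(wi * di for wi, di in zip(w, delta))

def bhc(rows, ids=None, threshold=9.0, min_size=4):
    """Recursive binary split; leaves are lists of row indices, nodes are 2-tuples."""
    ids = list(range(len(rows))) if ids is None else ids
    if len(rows) < min_size:
        return ids
    labels = fcm_two(rows)
    if fisher_score(rows, labels) < threshold:
        return ids  # split rejected, so this node stays a leaf cluster
    parts = []
    for k in (0, 1):
        sub = [(r, i) for r, i, l in zip(rows, ids, labels) if l == k]
        parts.append(bhc([r for r, _ in sub], [i for _, i in sub], threshold, min_size))
    return tuple(parts)

if __name__ == "__main__":
    base = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [0.2, 0.8]]
    data = base + [[x + 20, y + 20] for x, y in base]
    print(bhc(data))
```

The nested tuples returned by `bhc` mirror the tree representation mentioned in the abstract: each accepted split becomes an internal node, and each rejected split becomes a leaf cluster, so the number of clusters falls out of the recursion rather than being supplied in advance.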