Journal of Bioinformatics and Computational Biology

Model-based clustering with gene ranking using penalized mixtures of heavy-tailed distributions


Abstract

Cluster analysis of biological samples using gene expression measurements is a common task that aids the discovery of heterogeneous biological sub-populations with distinct mRNA profiles. Several model-based clustering algorithms have been proposed in which the distribution of gene expression values within each sub-group is assumed to be Gaussian. In the presence of noise and extreme observations, a mixture of Gaussian densities may over-fit and overestimate the true number of clusters. Moreover, commonly used model-based clustering algorithms generally provide no mechanism for quantifying the relative contribution of each gene to the final partitioning of the data. We propose a penalized mixture of Student's t distributions for model-based clustering and gene ranking. Together with a resampling procedure, the proposed approach provides a means of ranking genes according to their contributions to the clustering process. Experimental results show that the algorithm performs well relative to traditional Gaussian mixtures in the presence of outliers and longer-tailed distributions. The algorithm also identifies the true informative genes with high sensitivity and achieves improved model selection. An illustrative application to breast cancer data is also presented, which confirms established tumor sub-classes.
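The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of two ingredients that a penalized Student's t mixture would plausibly rely on, not the authors' algorithm. It assumes a single univariate component with fixed degrees of freedom `nu` and fixed `scale`, and it assumes an L1-type penalty on the cluster mean, which typically leads to soft-thresholding. The names `t_location`, `soft_threshold`, `nu`, `scale` and `lam`, and the toy data, are all hypothetical.

```python
import math


def t_location(xs, nu=3.0, scale=1.0, iters=50):
    """EM estimate of the location of a univariate Student's t sample.

    Assumption: nu (degrees of freedom) and scale are held fixed.
    """
    mu = sum(xs) / len(xs)  # start from the ordinary (Gaussian) mean
    for _ in range(iters):
        # E-step: each weight shrinks as the scaled squared distance grows,
        # so extreme observations contribute little to the update.
        u = [(nu + 1.0) / (nu + ((x - mu) / scale) ** 2) for x in xs]
        # M-step: weighted mean of the observations.
        mu = sum(w * x for w, x in zip(u, xs)) / sum(u)
    return mu


def soft_threshold(m, lam):
    """Shrink m toward zero by lam; values within lam of zero become 0.0.

    Assumption: this is the update an L1 penalty on a mean usually induces.
    """
    if abs(m) <= lam:
        return 0.0
    return math.copysign(abs(m) - lam, m)


# Toy expression values for one gene in one cluster, with a single outlier.
expression = [0.1, -0.2, 0.05, 0.0, -0.1, 25.0]

gaussian_mean = sum(expression) / len(expression)  # pulled toward the outlier
robust_mean = t_location(expression)               # stays near the bulk

print("Gaussian mean:", gaussian_mean)
print("Student's t location:", robust_mean)
print("Penalized mean:", soft_threshold(robust_mean, 0.5))
```

In this sketch a gene whose penalized cluster means are all shrunk to zero would not help separate the clusters. Repeating such a fit over resampled data and counting how often each gene keeps a non-zero mean is one way a ranking could be built, although the abstract does not confirm that this is the procedure used.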
