

Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables



Abstract

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying clustering structures. Hence removing noise variables via variable selection is necessary. For simultaneous variable selection and parameter estimation, existing penalized likelihood approaches in model-based clustering analysis all assume a common diagonal covariance matrix across clusters, which however may not hold in practice. To analyze high-dimensional data, particularly those with relatively low sample sizes, this article introduces a novel approach that shrinks the variances together with means, in a more general situation with cluster-specific (diagonal) covariance matrices. Furthermore, selection of grouped variables via inclusion or exclusion of a group of variables altogether is permitted by a specific form of penalty, which facilitates incorporating subject-matter knowledge, such as gene functions in clustering microarray samples for disease subtype discovery. For implementation, EM algorithms are derived for parameter estimation, in which the M-steps clearly demonstrate the effects of shrinkage and thresholding. Numerical examples, including an application to acute leukemia subtype discovery with microarray gene expression data, are provided to demonstrate the utility and advantage of the proposed method.
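The shrinkage-and-thresholding effect mentioned for the M-steps can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain L1 penalty `lam * sum(|mu_kj|)` on the cluster means only, holds the cluster-specific diagonal variances fixed, and leaves out both the variance penalty and the grouped-variable penalty described above. The names `soft_threshold`, `penalized_mean_update`, `tau`, `sigma2`, and `lam` are hypothetical and chosen for illustration. Under those assumptions, minimizing the penalized weighted least-squares term for each mean gives a soft-thresholded weighted average.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t; entries with |z| <= t become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def penalized_mean_update(X, tau, sigma2, lam):
    """One M-step update of the cluster means under an assumed L1 penalty.

    X      : (n, p) data, assumed standardized so that a mean of 0 in every
             cluster marks a variable as uninformative.
    tau    : (n, K) posterior cluster probabilities from the E-step.
    sigma2 : (K, p) current cluster-specific diagonal variances (held fixed).
    lam    : non-negative penalty weight (hypothetical tuning parameter).
    """
    n_k = tau.sum(axis=0)                    # effective cluster sizes, shape (K,)
    mu_tilde = (tau.T @ X) / n_k[:, None]    # unpenalized weighted means, (K, p)
    threshold = lam * sigma2 / n_k[:, None]  # larger variance or smaller cluster => more shrinkage
    return soft_threshold(mu_tilde, threshold)
```

In this sketch a mean whose unpenalized estimate falls below its threshold is set exactly to zero, which is how a penalty of this kind can drop a noise variable from a cluster; larger estimates are kept but pulled toward zero by the threshold amount.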


Bibliographic details

Journal: other
Authors: Benhuai Xie; Wei Pan; Xiaotong Shen
Volume: 2
Pages: 168–212
Total pages: 49
Original format: PDF

Similar literature

1. Model-based clustering with sparse covariance matrices [J]. Fop Michael, Murphy Thomas Brendan, Scrucca Luca. Statistics and Computing. 2019, Issue 4
2. Variable selection in penalized model-based clustering via regularization on grouped parameters [J]. Xie B, Pan W, Shen X. Biometrics: Journal of the Biometric Society : An International Society Devoted to the Mathematical and Statistical Aspects of Biology. 2008, Issue 3
3. Online Profiling for cluster-specific variable rate refreshing in high-density DRAM systems [C]. Rasool Sharifi, Zainalabedin Navabi. IEEE European Test Symposium. 2017
4. Approximating covariance matrices using low rank perturbations with applications to accent identification and social network clustering [D]. Purnell, Jonathan. 2010
5. Penalized model-based clustering with unconstrained covariance matrices [O]. Hui Zhou, Wei Pan, Xiaotong Shen
6. Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables [O]. Xie, Benhuai, Pan, Wei, Shen, Xiaotong. 2008
