A latent variables approach for clustering mixed binary and continuous variables within a Gaussian mixture model

Isabella Morlini


Abstract

When clustering objects, we often collect not only continuous variables but binary attributes as well. This paper proposes a model-based clustering approach for mixed binary and continuous variables in which each binary attribute is generated by a latent continuous variable that is dichotomized at a suitable threshold value, and in which the scores of the latent variables are estimated from the binary data. In economics, such variables are called utility functions, and the assumption is that the binary attributes (the presence or absence of a public service or utility) are determined by low and high values of these functions. In genetics, the latent response is interpreted as the ‘liability’ to develop a qualitative trait or phenotype. The estimated scores of the latent variables, together with the observed continuous variables, allow the use of a multivariate Gaussian mixture model for clustering, instead of a mixture of discrete and continuous distributions. After describing the method, this paper presents results on both simulated and real data and compares the performance of the multivariate Gaussian mixture model with that of a mixture of joint multivariate and multinomial distributions. Results show that the former model outperforms the mixture model for variables with different scales, both in terms of classification error rate and reproduction of the cluster means.
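The dichotomization idea in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes a single standard normal latent variable (no mixture components), and the names `dichotomize`, `estimate_threshold`, and the threshold value 0.5 are hypothetical choices made only for this illustration. Under that assumption, the share of ones in the binary attribute determines the threshold through the inverse normal CDF.

```python
import random
from statistics import NormalDist


def dichotomize(latent, threshold):
    """Return 1 where the latent score exceeds the threshold, else 0."""
    return [1 if z > threshold else 0 for z in latent]


def estimate_threshold(binary):
    """Estimate the threshold from the share of ones.

    Assumes the latent variable is standard normal, so that
    P(x = 1) = 1 - Phi(tau) and therefore tau = Phi^{-1}(1 - P(x = 1)).
    inv_cdf raises StatisticsError if the share of ones is exactly 0 or 1.
    """
    p_one = sum(binary) / len(binary)
    return NormalDist().inv_cdf(1.0 - p_one)


rng = random.Random(0)            # fixed seed for reproducibility
true_threshold = 0.5              # illustrative value, not from the paper
latent = [rng.gauss(0.0, 1.0) for _ in range(20000)]
binary = dichotomize(latent, true_threshold)
threshold_hat = estimate_threshold(binary)
print(f"true threshold: {true_threshold}, estimated: {threshold_hat:.3f}")
```

In the approach the abstract describes, latent scores estimated from the binary data would then be combined with the observed continuous variables and passed to a multivariate Gaussian mixture model; that step is omitted here because the abstract does not specify how the scores are estimated.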


Bibliographic record

Source
Advances in Data Analysis and Classification | 2012, Issue 1 | pp. 5-28 | 24 pages
Author
Isabella Morlini
Original format: PDF
Language: English (eng)

Similar literature


1. A latent variables approach for clustering mixed binary and continuous variables within a Gaussian mixture model [J]. Morlini Isabella. Advances in Data Analysis and Classification. 2012, Issue 1.
2. A mixture latent variable model for modeling mixed data in heterogeneous populations and its applications [J]. Amiri Leila, Khazaei Mojtaba, Ganjali Mojtaba. Advances in Statistical Analysis. 2018, Issue 1.
3. A mixture of generalized latent variable models for mixed mode and heterogeneous data [J]. Cai J.-H., Song X.-Y., Lam K.-H. Computational Statistics & Data Analysis. 2011, Issue 11.
4. Mix-nets: Factored Mixtures of Gaussians in Bayesian Networks with Mixed Continuous and Discrete Variables [C]. Scott Davies, Andrew Moore. Conference on Uncertainty in Artificial Intelligence. 2000.
5. Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes [D]. Yu, Ching-Yun. 2002.
6. Does nature have joints worth carving? A discussion of taxometrics, model-based clustering and latent variable mixture modeling [O]. G. H. Lubke, P. J. Miller.
7. Gaussian Mixture Modeling with Gaussian Process Latent Variable Models [O]. Nickisch, H., Rasmussen, C. 2010.
8. Mix-nets: Factored Mixtures of Gaussians in Bayesian Networks with Mixed Continuous and Discrete Variables [R]. Davies, S., Moore, A. 2000.
