Conference: International Conference on Machine Learning

Variable Selection in Model-Based Clustering: To Do or To Facilitate

Abstract

Variable selection for cluster analysis is a difficult problem. The difficulty originates not only from the lack of class information but also from the fact that high-dimensional data are often multifaceted and can be meaningfully clustered in multiple ways. In such cases, the effort to find one subset of attributes that presumably gives the "best" clustering may be misguided. It makes more sense to facilitate variable selection by domain experts, that is, to systematically identify various facets of a data set (each being based on a subset of attributes), cluster the data along each one, and present the results to the domain experts for appraisal and selection. In this paper, we propose a generalization of the Gaussian mixture model, show its ability to cluster data along multiple facets, and demonstrate that it is often more reasonable to facilitate variable selection than to perform it.
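The idea of clustering one data set along several facets can be illustrated with a small sketch. This is not the generalized Gaussian mixture model proposed in the paper: it assumes the facets (attribute subsets) are already given, whereas the paper identifies them systematically, and it simply fits an ordinary diagonal-covariance Gaussian mixture to each subset. The synthetic data and the names `FACETS`, `make_data`, `fit_diag_gmm`, `cluster_by_facet`, and `agreement` are all hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical facets: each maps a name to a subset of attribute indices.
FACETS = {"facet_A": [0, 1], "facet_B": [2, 3]}


def make_data(n=200, seed=0):
    """Synthetic rows with two independent groupings: label a drives
    attributes 0-1, label b drives attributes 2-3."""
    rng = random.Random(seed)
    rows, a_labels, b_labels = [], [], []
    for i in range(n):
        a, b = i % 2, (i // 2) % 2  # crossed design: a and b are unrelated
        rows.append([rng.gauss(6.0 * a, 1.0), rng.gauss(6.0 * a, 1.0),
                     rng.gauss(6.0 * b, 1.0), rng.gauss(6.0 * b, 1.0)])
        a_labels.append(a)
        b_labels.append(b)
    return rows, a_labels, b_labels


def fit_diag_gmm(rows, k=2, n_iter=50):
    """Fit a k-component diagonal-covariance Gaussian mixture with plain EM
    and return a hard cluster label for every row."""
    n, d = len(rows), len(rows[0])
    # Deterministic start: take initial means at evenly spaced quantiles
    # of the rows ordered by coordinate sum.
    order = sorted(range(n), key=lambda i: sum(rows[i]))
    means = [list(rows[order[(2 * j + 1) * n // (2 * k)]]) for j in range(k)]
    variances = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: responsibilities, computed from log-densities for stability.
        resp = []
        for x in rows:
            logp = [
                math.log(weights[j]) - 0.5 * sum(
                    math.log(2 * math.pi * variances[j][t])
                    + (x[t] - means[j][t]) ** 2 / variances[j][t]
                    for t in range(d))
                for j in range(k)
            ]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: re-estimate weights, means, and per-attribute variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            for t in range(d):
                means[j][t] = sum(r[j] * x[t] for r, x in zip(resp, rows)) / nj
                variances[j][t] = sum(
                    r[j] * (x[t] - means[j][t]) ** 2
                    for r, x in zip(resp, rows)) / nj + 1e-6
    return [max(range(k), key=lambda j: r[j]) for r in resp]


def cluster_by_facet(rows, facets):
    """Cluster the same rows once per facet, using only that facet's attributes."""
    return {name: fit_diag_gmm([[r[c] for c in cols] for r in rows])
            for name, cols in facets.items()}


def agreement(u, v):
    """Fraction of rows on which two 2-way partitions agree, up to label swap."""
    same = sum(1 for p, q in zip(u, v) if p == q) / len(u)
    return max(same, 1.0 - same)


if __name__ == "__main__":
    rows, a_labels, b_labels = make_data()
    partitions = cluster_by_facet(rows, FACETS)
    print("facet_A vs label a:", agreement(partitions["facet_A"], a_labels))
    print("facet_B vs label b:", agreement(partitions["facet_B"], b_labels))
    print("facet_A vs facet_B:", agreement(partitions["facet_A"], partitions["facet_B"]))
```

In this toy setting each facet yields its own meaningful partition, and the two partitions are unrelated to each other. That is the situation the abstract describes, where presenting several clusterings to a domain expert may be more useful than committing to a single attribute subset.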