Journal of Classification

Outlier Identification in Model-Based Cluster Analysis


Abstract

In model-based clustering based on normal-mixture models, a few outlying observations can influence the cluster structure and number. This paper develops a method to identify these outliers; however, it does not attempt to identify clusters amidst a large field of noisy observations. We identify outliers as those observations in a cluster with minimal membership proportion, or for which the cluster-specific variance with and without the observation is very different. Results from a simulation study demonstrate the ability of our method to detect true outliers without falsely identifying many non-outliers, and show improved performance over other approaches under most scenarios. We use the contributed R package MCLUST for model-based clustering, but propose a modified prior for the cluster-specific variance that avoids degeneracies in estimation procedures. We also compare results from our outlier method to published results on National Hockey League data.
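
The two criteria named in the abstract (a cluster with very small membership proportion, and a large change in cluster-specific variance when one observation is left out) can be illustrated with a minimal sketch. This is not the authors' procedure and does not call MCLUST: it assumes one-dimensional data, hard cluster assignments already available, and two hypothetical thresholds (`min_proportion`, `ratio_threshold`) chosen purely for illustration.

```python
from statistics import pvariance


def leave_one_out_variance_ratios(values):
    """For each point, return var(all points) / var(all points except it).

    A ratio far above 1 means the point inflates the cluster variance.
    """
    full_var = pvariance(values)
    ratios = []
    for i in range(len(values)):
        rest = values[:i] + values[i + 1:]
        rest_var = pvariance(rest)
        if rest_var == 0.0:
            # Remaining points are identical: treat any spread as infinite inflation.
            ratios.append(float("inf") if full_var > 0.0 else 1.0)
        else:
            ratios.append(full_var / rest_var)
    return ratios


def flag_outliers(clusters, min_proportion=0.05, ratio_threshold=3.0):
    """Flag candidate outliers in hard-assigned 1-D clusters.

    clusters: dict mapping a cluster label to a list of floats.
    Both thresholds are hypothetical illustration values, not from the paper.
    Returns a list of (label, value) pairs.
    """
    n_total = sum(len(values) for values in clusters.values())
    flagged = []
    for label, values in clusters.items():
        # Criterion 1: the cluster holds a very small share of all observations.
        if len(values) / n_total < min_proportion:
            flagged.extend((label, x) for x in values)
            continue
        # Criterion 2 needs at least two points left after removing one.
        if len(values) < 3:
            continue
        ratios = leave_one_out_variance_ratios(values)
        for x, ratio in zip(values, ratios):
            if ratio > ratio_threshold:
                flagged.append((label, x))
    return flagged


if __name__ == "__main__":
    example = {
        "a": [1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 1.05, 0.95, 1.1, 0.9, 9.0],
        "b": [5.0, 5.1, 4.9, 5.2, 4.8, 5.0, 5.05, 4.95, 5.1, 4.9],
        "c": [20.0],
    }
    print(flag_outliers(example))
```

In practice the cluster assignments would come from a fitted mixture model, and the variance comparison would use the model's cluster-specific covariance estimates rather than the simple sample variance used here.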