Outlier Identification in Model-Based Cluster Analysis

Evans Katie; Love Tanzy; Thurston Sally W.

Abstract

In model-based clustering based on normal-mixture models, a few outlying observations can influence the cluster structure and the number of clusters. This paper develops a method to identify these outliers; however, it does not attempt to identify clusters amidst a large field of noisy observations. We identify outliers as those observations in a cluster with minimal membership proportion, or those for which the cluster-specific variance with and without the observation is very different. Results from a simulation study demonstrate that, under most scenarios, our method detects true outliers without falsely identifying many non-outliers and performs better than other approaches. We use the contributed R package MCLUST for model-based clustering, but propose a modified prior for the cluster-specific variance which avoids degeneracies in estimation procedures. We also compare results from our outlier method to published results on National Hockey League data.
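
The two criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes univariate data and hard cluster labels that have already been computed, it uses plain sample variances rather than MCLUST estimates with the modified prior, and the function name `flag_outliers` and both thresholds are hypothetical choices made only for this example.

```python
from statistics import variance


def flag_outliers(values, labels, min_proportion=0.05, max_variance_ratio=3.0):
    """Return sorted indices flagged by either illustrative criterion.

    values: 1-D observations; labels: one hard cluster label per observation.
    Both thresholds are hypothetical, not values taken from the paper.
    """
    n = len(values)
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    flagged = set()
    for members in clusters.values():
        # Criterion 1: the cluster holds only a very small share of the data.
        if len(members) / n < min_proportion:
            flagged.update(members)
            continue
        # Criterion 2 needs at least two points left after dropping one.
        if len(members) < 3:
            continue
        full_var = variance([values[i] for i in members])
        for i in members:
            rest_var = variance([values[j] for j in members if j != i])
            # Flag the point if removing it shrinks the cluster variance sharply.
            if rest_var > 0 and full_var / rest_var > max_variance_ratio:
                flagged.add(i)
    return sorted(flagged)


# Toy data (made up): two tight clusters, one stray point inside the first
# cluster (index 9), and one singleton cluster (index 20).
tight = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, -0.05, 0.15, -0.15]
values = tight + [5.0] + [10.0 + v for v in tight] + [10.0] + [30.0]
labels = [0] * 10 + [1] * 10 + [2]
print(flag_outliers(values, labels))
```

A ratio of variances is only one way to quantify "very different"; the abstract does not specify the comparison used, so the ratio and its cutoff here should be read as placeholders.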

Bibliographic details

Source: Journal of Classification, 2015, Issue 1, 22 pages
Authors: Evans Katie; Love Tanzy; Thurston Sally W.
Original format: PDF
Language: English
Classification (Chinese Library Classification): Theory and methodology of natural sciences
Keywords: Normal-mixture models; Influential points; MCLUST; Prior; National Hockey League
