Journal: Journal of Statistical Computation and Simulation

Outlier detection for high dimensional data using the Comedian approach

Abstract

Outlier detection is an important part of data analysis because outliers can affect inference. Various methods are available in the literature for detecting outliers in multivariate data using the Mahalanobis distance measure [V. Barnett and T. Lewis, Outliers in Statistical Data, John Wiley & Sons, Chichester, 1994]. This paper proposes an alternative method of outlier detection based on the comedian introduced by Falk [On MAD and Comedians, Ann. Inst. Statist. Math. 49 (1997), pp. 615-644]. The proposed method has a high breakdown value and low computation time. Its success rates (SR) and false detection rates (FDR) are studied and compared with those of some well-known outlier detection methods through a simulation study. The Comedian method has high SR and low FDR for all combinations of parameters. After the detected outliers are removed or down-weighted, highly robust and approximately affine equivariant estimators of multivariate location and scatter can be obtained. Finally, the method is applied to well-known real data sets to evaluate its performance.
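
The abstract does not spell out the detection procedure, so the following is only a minimal sketch of the building block it names: Falk's comedian, COM(X, Y) = med((X - med X)(Y - med Y)), computed pairwise over the columns of a data matrix. The function names `comedian_matrix` and `raw_mad`, the simulated data, and the contamination are all assumptions made for illustration; this is not the authors' full outlier detection method.

```python
import numpy as np

def comedian_matrix(X):
    """Pairwise comedian: COM(X_i, X_j) = med((X_i - med X_i) * (X_j - med X_j))."""
    X = np.asarray(X, dtype=float)
    centered = X - np.median(X, axis=0)   # subtract each column's median
    p = X.shape[1]
    com = np.empty((p, p))
    for i in range(p):
        for j in range(i, p):
            # median of the elementwise products of two median-centered columns
            com[i, j] = com[j, i] = np.median(centered[:, i] * centered[:, j])
    return com

def raw_mad(X):
    """Unscaled median absolute deviation of each column."""
    X = np.asarray(X, dtype=float)
    return np.median(np.abs(X - np.median(X, axis=0)), axis=0)

# Hypothetical data: 201 rows (odd, so medians are actual order statistics),
# 3 variables, with the first 5 rows shifted far away as planted outliers.
rng = np.random.default_rng(0)
data = rng.normal(size=(201, 3))
data[:5] += 50.0

com = comedian_matrix(data)
mad = raw_mad(data)

# Comedian analogue of a correlation matrix: scale by the product of MADs.
com_corr = com / np.outer(mad, mad)
```

For an odd number of rows, the diagonal of `com` equals the squared unscaled MAD of each column, in the same way that the diagonal of a covariance matrix holds the variances. With an even number of rows, `np.median` averages the two middle values and the identity holds only approximately.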
