A comparison of two boxplot methods for detecting univariate outliers which adjust for sample size and asymmetry

Abstract

Identifying outliers is important because their inclusion, especially when parametric methods are used, can distort the analysis and lead to erroneous conclusions. One of the easiest and most useful detection methods is based on the boxplot. This method is particularly appealing because it does not use any outliers in computing spread. Two methods, one by Carling and another by Schwertman and de Silva, adjust the boxplot method for sample size and skewness. In this paper, the two procedures are compared both theoretically and by Monte Carlo simulation. Simulations using both a symmetric and an asymmetric distribution were performed on data sets with no, one, and several outliers. Based on the simulations, the Carling approach is superior at avoiding masking, that is, it is less likely to overlook an outlier, while the Schwertman and de Silva procedure is much better at reducing swamping, that is, misclassifying an observation as an outlier. Carling's method is to the Schwertman and de Silva procedure as the comparisonwise error rate is to the experimentwise error rate in multiple comparisons. Rather than being competitors, the two methods appear to complement each other. Used in tandem, they give the data analyst a more complete perspective for identifying possible outliers.
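
For readers unfamiliar with the boxplot rule that both methods adjust, the following is a minimal sketch of the classic, unadjusted version only. It is not Carling's method or the Schwertman and de Silva procedure, whose adjustment formulas are not given in the abstract. The fence multiplier `k=1.5`, the quartile convention (`method="inclusive"`), the function names, and the sample data are all assumptions chosen for illustration.

```python
from statistics import quantiles


def boxplot_fences(data, k=1.5):
    """Return (lower, upper) fences of the classic boxplot rule.

    Assumption: k=1.5 is the conventional multiplier; the adjusted
    methods compared in the paper replace it with other constants.
    """
    # Quartiles depend only on the central part of the sample, so a
    # few extreme values do not inflate the spread estimate.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(data, k=1.5):
    """Return the observations that fall outside the fences."""
    lower, upper = boxplot_fences(data, k)
    return [x for x in data if x < lower or x > upper]


# Hypothetical sample: eight values near 10 and one extreme value.
sample = [9.8, 10.1, 10.3, 9.9, 10.0, 10.2, 9.7, 10.4, 25.0]
print(flag_outliers(sample))  # -> [25.0]
```

Because the interquartile range ignores the tails, the extreme value 25.0 does not widen the fences that are used to judge it, which is the property the abstract describes as not using outliers in computing spread.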

Bibliographic details

  • Source
    Statistical Methodology | 2009, No. 6 | pp. 604-621 | 18 pages
  • Author affiliations

    Department of Mathematics & Statistics, California State University, Chico, 400 West First Street, Chico, CA 95929-0525, United States;

    Department of Mathematics & Statistics, California State University, Chico, 400 West First Street, Chico, CA 95929-0525, United States;

    Department of Mathematics & Statistics, California State University, Chico, 400 West First Street, Chico, CA 95929-0525, United States;

  • Original format: PDF
  • Language: English
  • Keywords

    box-plot; masking; Monte Carlo; simulation; swamping
