Journal: Advanced Science Letters

A Comparative Study of Outliers Identification Methods in Univariate Data Set


Abstract

An outlier is an observation that is inconsistent with, and deviates markedly from, the majority of the data. Detecting outliers is a crucial part of data analysis because contaminated observations can distort the analysis and lead to erroneous conclusions. Several outlier detection methods have been proposed in the literature to identify multiple outliers in a data set. In this paper, the existing outlier detection methods are discussed and reviewed. We generate contaminated data with extreme and mild outlying observations. Tukey’s boxplot, Hampel’s test, the modified Z-score method, sequential fences, and the extreme standardized deviate test are applied to a simulated data set in order to investigate their sensitivity towards outliers and the time each technique takes to detect them. The results suggest that Tukey’s boxplot method offers substantially better performance and that the sequential fences procedure shows good results under certain confidence levels.
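The abstract names its methods without giving formulas, so the following is only a minimal sketch of three of them (Tukey's boxplot fences, the Hampel identifier, and the modified Z-score) using their common textbook definitions. The function names, the sample data, and the cutoffs (k = 1.5, a threshold of 3 scaled MADs, and 3.5 for the modified Z-score) are assumptions chosen for illustration and are not taken from the paper. Sequential fences and the extreme standardized deviate test are not covered here.

```python
import statistics

def tukey_outliers(data, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

def median_abs_deviation(data):
    """Return (median, MAD), where MAD is the median of |x - median|."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    return med, mad

def hampel_outliers(data, threshold=3.0):
    """Flag points more than `threshold` scaled MADs from the median.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    under normality; threshold=3 is an assumed, commonly used cutoff.
    """
    med, mad = median_abs_deviation(data)
    if mad == 0:
        raise ValueError("MAD is zero; the Hampel identifier is undefined")
    return [x for x in data if abs(x - med) > threshold * 1.4826 * mad]

def modified_z_outliers(data, cutoff=3.5):
    """Flag points whose modified Z-score 0.6745*(x - median)/MAD exceeds `cutoff`."""
    med, mad = median_abs_deviation(data)
    if mad == 0:
        raise ValueError("MAD is zero; the modified Z-score is undefined")
    return [x for x in data if abs(0.6745 * (x - med) / mad) > cutoff]

# Hypothetical sample: eight clustered values and one extreme observation.
sample = [2.1, 2.3, 2.2, 2.4, 2.0, 2.5, 2.3, 2.2, 9.8]
print(tukey_outliers(sample))
print(hampel_outliers(sample))
print(modified_z_outliers(sample))
```

On this small sample all three rules flag only the value 9.8. Milder contamination is where they would be expected to diverge, which is the kind of sensitivity comparison the abstract describes.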
