A Comparative Study of Outliers Identification Methods in Univariate Data Set


Abstract

An outlier is an observation that is inconsistent with, and deviates markedly from, the majority of the data. Detecting outliers is a crucial part of data analysis, since contaminated observations can have a negative impact on the analysis. The presence of outliers can distort the analysis and lead to erroneous conclusions. Several outlier detection methods have been proposed in the literature to identify multiple outliers in a data set. In this paper, the existing outlier detection methods are discussed and reviewed. We generate contaminated data with extreme and mild outlying observations. Tukey's boxplot, Hampel's test, the modified Z-score method, sequential fences and the extreme standardized deviate test are applied to a simulated data set in order to investigate their sensitivity to outliers and the time each technique takes to detect them. The results suggest that Tukey's boxplot method offers substantially better performance, and that the sequential fences procedure shows good results under certain confidence levels.
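Three of the methods named in the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the function names, the sample data, the quartile interpolation rule, and the cutoffs (k = 1.5 for Tukey's fences, 3 scaled MADs for the Hampel identifier, 3.5 for the modified Z-score) are assumptions based on commonly cited defaults, and the paper may use different settings. Sequential fences and the extreme standardized deviate test are not covered here.

```python
from statistics import median, quantiles


def tukey_outliers(data, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's boxplot rule)."""
    # Quartile definitions vary between packages; "inclusive" is an assumption.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


def median_and_mad(data):
    """Return the median and the median absolute deviation (MAD)."""
    med = median(data)
    mad = median([abs(x - med) for x in data])
    if mad == 0:
        # Both MAD-based rules below are undefined when MAD is zero.
        raise ValueError("MAD is zero; MAD-based rules cannot be applied")
    return med, mad


def hampel_outliers(data, threshold=3.0):
    """Flag points more than `threshold` scaled MADs from the median."""
    med, mad = median_and_mad(data)
    scaled_mad = 1.4826 * mad  # consistency factor for normally distributed data
    return [x for x in data if abs(x - med) > threshold * scaled_mad]


def modified_z_outliers(data, cutoff=3.5):
    """Flag points whose modified Z-score 0.6745*(x - median)/MAD exceeds cutoff."""
    med, mad = median_and_mad(data)
    return [x for x in data if abs(0.6745 * (x - med) / mad) > cutoff]


# Hypothetical sample: nine well-behaved values and one extreme observation.
sample = [10, 11, 12, 12, 13, 13, 14, 15, 16, 50]

print(tukey_outliers(sample))       # [50]
print(hampel_outliers(sample))      # [50]
print(modified_z_outliers(sample))  # [50]
```

All three rules agree on this small sample because the contaminating value is extreme. They tend to diverge on mild outliers, which is the kind of sensitivity difference the study compares.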

Bibliographic details

Source
Advanced Science Letters | 2017, Issue 2 | 6 pages
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords
Boxplot; Masking; Outliers Detection; Swamping

Similar literature

1. A Comparative Study of Outliers Identification Methods in Univariate Data Set [J]. Advanced Science Letters. 2017, Issue 2
2. Identification of Outliers in Oxazolines and Oxazoles High Dimension Molecular Descriptor Dataset Using Principal Component Outlier Detection Algorithm and Comparative Numerical Study of Other Robust Estimators [J]. Doreswamy, Chanabasayya M. Vastrad. International Journal of Data Mining & Knowledge Management Process. 2013, Issue 4
3. Outlier Detection in Data Streams - A Comparative Study of Selected Methods [J]. Agnieszka Duraj, Piotr S. Szczepaniak. Procedia Computer Science. 2021, Issue a
4. Outlier Labeling Method for Univariate Data for Module Test and Die Sort [C]. T. Saeger, B. Kleven, I. Otero. International Conference on Compound Semiconductor Manufacturing Technology. 2016
5. Identification of multivariate outliers in large data sets [D]. Werner, Mark. 2003
6. Universal Linear Fit Identification: A Method Independent of Data Outliers and Noise Distribution Model and Free of Missing or Removed Data Imputation [O]. K. K. L. B. Adikaram, M. A. Hussein, M. Effenberger, -1
7. A Review and Comparison of Methods for Detecting Outliers in Univariate Data Sets [O]. Seo Songwon. 2006
