Comparison of Outlier Detection Procedures in Multiple Linear Regressions

Gafar Matanmi Oyeyemi; Abdulwasiu Bukoye; Imam Akeyede

Abstract

Regression analysis has become one of the most widely used statistical tools for analyzing multifactor data. It is appealing because it provides a conceptually simple method for investigating functional relationships among variables. A relationship is expressed as an equation or model connecting the response (dependent) variable to one or more explanatory (predictor) variables. A major problem that statisticians face in regression analysis is the presence of outliers in the data. An outlier is an observation that lies outside the overall pattern of a distribution; by a common rule of thumb, it is a point that falls more than 1.5 times the interquartile range above the third quartile or below the first quartile. Several statistics are available for detecting whether outliers are present in data. In this study, a simulation was conducted to investigate the performance of DFFITS, Cook's distance and Mahalanobis distance at different proportions of outliers (10%, 20% and 30%) and for various sample sizes (10, 30 and 100), with outliers placed in the first, the second, or both independent variables. The data were generated in R from a normal distribution, while the outliers were drawn from a uniform distribution. Findings: For small and medium sample sizes at the 10% level of outliers, Mahalanobis distance should be employed for its accuracy in detecting outliers. For small, medium and large sample sizes with higher percentages of outliers, DFFITS should be employed. For small, medium and large sample sizes, DFFITS should be used to detect an outlier signal irrespective of the percentage of outliers in the data set. For small samples with a low percentage of outliers, Mahalanobis distance should be employed for ease of computation.
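The three statistics compared in the abstract can be illustrated with a minimal sketch. This is not the authors' R code: it is written in Python with NumPy, and the regression coefficients, the uniform range used for the outliers, the order in which the response is generated, and the cutoff rules (2*sqrt(p/n) for DFFITS, 4/n for Cook's distance, a 97.5% chi-square quantile for the squared Mahalanobis distance) are all assumptions chosen for illustration, since the abstract does not state them. The function name regression_diagnostics is likewise hypothetical.

```python
import math
import numpy as np


def regression_diagnostics(X, y):
    """Return (leverage, dffits, cooks_d, mahalanobis_sq) for an OLS fit.

    X is an (n, k) array of predictors WITHOUT the intercept column.
    """
    n, _ = X.shape
    A = np.column_stack([np.ones(n), X])       # design matrix with intercept
    p = A.shape[1]                             # number of fitted coefficients
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    # Leverages: diagonal of the hat matrix H = A (A'A)^{-1} A'
    h = np.einsum("ij,ji->i", A, np.linalg.solve(A.T @ A, A.T))
    s2 = resid @ resid / (n - p)               # residual variance estimate
    # Leave-one-out variance, needed for externally studentized residuals
    s2_loo = ((n - p) * s2 - resid**2 / (1 - h)) / (n - p - 1)
    t = resid / np.sqrt(s2_loo * (1 - h))
    dffits = t * np.sqrt(h / (1 - h))
    cooks_d = resid**2 * h / (p * s2 * (1 - h) ** 2)
    # Squared Mahalanobis distance of each predictor row from the centroid
    centered = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return h, dffits, cooks_d, d2


# Illustrative simulation; every numeric setting below is an assumption.
rng = np.random.default_rng(0)
n, frac = 30, 0.10                 # one sample size and outlier share from the abstract
X = rng.normal(size=(n, 2))        # two predictors drawn from a standard normal
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)  # assumed model

# Contaminate the first predictor AFTER generating y (assumed design), so the
# planted points are both far from the centroid and off the fitted surface.
n_out = round(frac * n)
outlier_idx = rng.choice(n, size=n_out, replace=False)
X[outlier_idx, 0] = rng.uniform(5.0, 10.0, size=n_out)  # assumed uniform range

h, dffits, cooks_d, d2 = regression_diagnostics(X, y)

p = X.shape[1] + 1
flag_dffits = np.abs(dffits) > 2.0 * math.sqrt(p / n)   # conventional cutoff
flag_cook = cooks_d > 4.0 / n                            # conventional cutoff
# For exactly 2 predictors the chi-square quantile has the closed form
# -2*ln(1 - q); with a different number of predictors this line must change.
flag_mahal = d2 > -2.0 * math.log(1.0 - 0.975)

for name, flag in [("DFFITS", flag_dffits), ("Cook", flag_cook), ("Mahalanobis", flag_mahal)]:
    hits = np.intersect1d(np.flatnonzero(flag), outlier_idx).size
    print(f"{name}: flagged {flag.sum()} points, {hits} of {n_out} planted outliers")
```

Repeating the run over many random seeds, the three sample sizes, the three outlier proportions and the three contamination patterns would give detection rates of the kind the study compares. Note that the Mahalanobis distance here uses only the predictors, so it is tied to leverage (h = 1/n + d2/(n - 1)), whereas DFFITS and Cook's distance also depend on the residuals.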

Bibliographic details

Source: American Journal of Mathematics and Statistics, 2015, Issue 1, 5 pages
Authors: Gafar Matanmi Oyeyemi; Abdulwasiu Bukoye; Imam Akeyede
Original format: PDF
Chinese Library Classification: Probability theory and mathematical statistics

Similar literature

1. A Comparative analysis of multiple outlier detection procedures in the linear regression model [J]. James W. Wisnowski, Douglas C. Montgomery, James R. Simpson. Computational Statistics & Data Analysis. 2001, No. 3
2. Performances Comparison of Information Criteria for Outlier Detection in Multiple Regression Models Having Multicollinearity Problems using Genetic Algorithms [J]. Ozlem Gurunlu Alma. Matematika. 2013, No. 2013
3. Multiple-Case Outlier Detection in Multiple Linear Regression Model Using Quantum-Inspired Evolutionary Algorithm [J]. Salena Akter, Mozammel H A Khan. Journal of Computers. 2010, No. 12
4. A Comparative Study of Outlier Detection Procedures in Multiple Linear Regression [C]. Pimpan Ampanthong, Prachoom Suwattee. International MultiConference of Engineers and Computer Scientists. 2009
5. A COMPARISON OF UNBIASED, BIASED, AND WEIGHTED MULTIPLE LINEAR REGRESSION APPROACHES TO SUPPORT EDUCATIONAL POLICY IN THE IDENTIFICATION OF OUTLIER SCHOOLS (RIDGE REGRESSION, PREDICTION, RESIDUAL, EXPLANATION, NEEDS, ASSESSMENT) [D]. Bigelow, Robert Ashley. 1984
6. Detecting outliers when fitting data with nonlinear regression – a new method based on robust nonlinear regression and the false discovery rate [O]. Harvey J Motulsky, Ronald E Brown. 2006
7. Multiple outliers detection procedures in linear regression [O]. Adnan Robiah, Mohamad Mohd Nor, Setan Halim. 2003
8. Multiple Outliers in Linear Regression: Advances in Detection Methods, Robust Estimation, and Variable Selection [R]. Wisnowski, J. W. 1999
