
Multiple Outliers in Linear Regression: Advances in Detection Methods, Robust Estimation, and Variable Selection


Abstract

Empirical evidence suggests that unusual or outlying observations in data sets are much more prevalent than one might expect: on average, 5 to 10% of observations in many industries. This research addresses multiple outliers in the linear regression model. Although reliable when there is a single outlier or only a few, standard diagnostic techniques based on an ordinary least squares (OLS) fit can fail to identify multiple outliers. The parameter estimates, diagnostic quantities, and model inferences obtained from the contaminated data set can differ significantly from those obtained with the clean data. The researcher therefore requires a dependable method to identify and accommodate these multiple outliers. This research tests both direct methods (detection algorithms) and indirect methods (robust regression estimators) for identifying multiple outliers. A comprehensive Monte Carlo simulation study evaluates the impact that outlier density and geometry, regressor variable dimension, and outlying distance have on numerous published methods. The performance study focuses on outlier configurations likely to be encountered in practice and uses a designed experiment approach. The results for each scenario give insight into the performance of each technique and its limitations, and recommendations are given for each technique. OLS is the optimal regression estimator under a set of assumptions on the distribution of the error term and the predictor variables. Compound robust regression estimators have been proposed as alternatives when some of the OLS assumptions fail. Compound estimators can accommodate multiple outliers and limit the influence of observations with remote levels of the predictor variables. This research proposes a new compound estimator that is more effective than currently published methods for extreme observations in X space and in high dimensions. This research also addresses the variable selection problem for compound robust regression estimators.
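The masking effect mentioned in the abstract, where OLS diagnostics miss a cluster of outliers, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the synthetic data (54 clean points plus 6 clustered high-leverage outliers, about 10% contamination), the cutoff of 2.5, and the use of an exhaustive least-median-of-squares (LMS) line fit as a stand-in robust estimator. The abstract does not name the methods the report evaluates, and this is not the compound estimator it proposes.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 54 clean points on y = 2 + x, plus 6 clustered
# high-leverage outliers (10% contamination) far below that line.
x_clean = np.linspace(0.0, 10.0, 54)
y_clean = 2.0 + x_clean + rng.normal(0.0, 0.5, x_clean.size)
x_out = np.linspace(19.5, 20.5, 6)
y_out = 2.0 + rng.normal(0.0, 0.5, x_out.size)

x = np.concatenate([x_clean, x_out])
y = np.concatenate([y_clean, y_out])
is_outlier = np.arange(x.size) >= x_clean.size
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape
CUTOFF = 2.5  # assumed flagging threshold on standardized residuals

# OLS fit and internally studentized residuals.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid_ols = y - X @ beta_ols
q, _ = np.linalg.qr(X)                # reduced QR: q is n x p
leverage = np.sum(q**2, axis=1)       # diagonal of the hat matrix
s = np.sqrt(resid_ols @ resid_ols / (n - p))
t_ols = resid_ols / (s * np.sqrt(1.0 - leverage))
ols_flags = np.abs(t_ols) > CUTOFF

# Stand-in robust fit: LMS line found by trying the line through
# every pair of points and keeping the smallest median squared residual.
best_crit, beta_lms = np.inf, None
for i, j in combinations(range(n), 2):
    if x[i] == x[j]:
        continue  # vertical line, slope undefined
    slope = (y[j] - y[i]) / (x[j] - x[i])
    intercept = y[i] - slope * x[i]
    crit = np.median((y - intercept - slope * x) ** 2)
    if crit < best_crit:
        best_crit, beta_lms = crit, np.array([intercept, slope])

# Residuals scaled by a rough robust scale estimate (no finite-sample correction).
resid_lms = y - X @ beta_lms
scale_lms = 1.4826 * np.sqrt(best_crit)
lms_flags = np.abs(resid_lms) / scale_lms > CUTOFF

print("OLS slope:", beta_ols[1], "LMS slope:", beta_lms[1])
print("planted outliers flagged by OLS:", int(ols_flags[is_outlier].sum()))
print("planted outliers flagged by LMS:", int(lms_flags[is_outlier].sum()))
```

In this configuration the outlying cluster pulls the OLS line toward itself and inflates the residual scale, so the studentized residuals of the planted points stay below the cutoff, while residuals from the robust fit separate them clearly. The exhaustive pair search is only practical for a single regressor and small samples; it is shown for transparency rather than efficiency.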
