
Regression diagnostics in large and high dimensional data


Abstract

"Learning methods" play a key role in statistics, data mining, and artificial intelligence, and they intersect with engineering and other disciplines. These methods for analyzing and modeling data come in two flavors: supervised and unsupervised learning. Regression analysis and classification are two well-known supervised learning techniques. To obtain an effective model from regression analysis, it is necessary to check and preprocess the data set in fields such as astronomy, bioinformatics, image analysis, and computer vision, especially when the data sets are large and high dimensional. In these fields, large or "fat" data sets very naturally contain unusual observations (outliers). Checking raw data for outliers in a regression setting is known as regression diagnostics. Most of the popular diagnostic methods are not adequate for large and high-dimensional data. The aim of this paper is to provide a new measure for identifying influential observations in linear regression for large, high-dimensional data.
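To make the notion of an influential observation concrete, the following is a minimal sketch of one classical diagnostic, Cook's distance, for ordinary least squares. It is not the new measure proposed in the paper, which the abstract does not specify. The function name `cooks_distance`, the simulated data, and the `4 / n` cutoff (a common rule of thumb) are assumptions made for illustration only.

```python
import numpy as np

def cooks_distance(X, y):
    """Classical Cook's distance for an OLS fit (illustrative helper)."""
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept
    p = Xd.shape[1]
    # The thin QR factor yields leverages without forming the n x n hat matrix.
    Q, _ = np.linalg.qr(Xd)
    leverage = np.sum(Q**2, axis=1)
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    s2 = resid @ resid / (n - p)  # residual variance estimate, needs n > p
    return resid**2 * leverage / (p * s2 * (1.0 - leverage) ** 2)

# Simulated data (hypothetical): 200 observations, 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Plant one observation that is far out in X and far off the true line.
X[0] = [8.0, 8.0, 8.0]
y[0] = 50.0

d = cooks_distance(X, y)
flagged = np.flatnonzero(d > 4 / len(y))  # rule-of-thumb cutoff, an assumption
```

Note that this classical computation needs more observations than predictors, because the residual variance divides by `n - p` and the leverages must stay below 1. That requirement is one reason such diagnostics can break down when the data are high dimensional.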
