Electronic Journal of Statistics

Data enriched linear regression

Abstract

We present a linear regression method for making predictions on a small data set using a second, possibly biased, data set that may be much larger. Our method fits linear regressions to the two data sets while penalizing the difference between the predictions made by those two models. The resulting algorithm is a shrinkage method similar to those used in small area estimation. We find a Stein-type result for Gaussian responses: when the model has $5$ or more coefficients and $10$ or more error degrees of freedom, it becomes inadmissible to use only the small data set, no matter how large the bias is. We also present both plug-in and AICc-based methods to tune our penalty parameter. Most of our results use an $L_{2}$ penalty, but we obtain formulas for $L_{1}$ penalized estimates when the model is specialized to the location setting. Ordinary Stein shrinkage provides an inadmissibility result for only $3$ or more coefficients, but we find that our shrinkage method typically produces much lower squared errors in as few as $5$ or $10$ dimensions when the bias is small, and essentially equivalent squared errors when the bias is large.
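The joint fit described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the $L_{2}$ penalty is taken on the difference in predictions over the small-set design matrix (i.e., the penalty matrix is $X_S^\top X_S$), and the function name and interface are invented for the example. Setting the gradients of the penalized objective to zero yields a linear block system in the two coefficient vectors, solved directly here.

```python
import numpy as np

def data_enriched_ls(Xs, ys, Xb, yb, lam):
    """Fit two linear regressions jointly with a penalty on the
    difference between their predictions (illustrative sketch).

    Minimizes, over (beta, gamma):
        ||ys - Xs @ beta||^2 + ||yb - Xb @ gamma||^2
        + lam * (beta - gamma)' (Xs' Xs) (beta - gamma)

    Xs, ys: small (target) data set; Xb, yb: large, possibly biased set.
    lam = 0 recovers separate OLS fits; lam -> infinity forces a pooled fit.
    """
    p = Xs.shape[1]
    A_ss = Xs.T @ Xs          # small-set Gram matrix
    A_bb = Xb.T @ Xb          # big-set Gram matrix
    P = A_ss                  # assumed penalty matrix: X_S' X_S
    # Stationarity conditions give the block system
    #   [A_ss + lam*P,   -lam*P     ] [beta ]   [Xs' ys]
    #   [  -lam*P,     A_bb + lam*P ] [gamma] = [Xb' yb]
    M = np.block([[A_ss + lam * P, -lam * P],
                  [-lam * P, A_bb + lam * P]])
    rhs = np.concatenate([Xs.T @ ys, Xb.T @ yb])
    sol = np.linalg.solve(M, rhs)
    return sol[:p], sol[p:]
```

With `lam = 0` the system is block diagonal and each coefficient vector is the ordinary least-squares fit on its own data set; the estimator used for prediction on the small set is the shrunken `beta`.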
