Finding Respondents in the Forest: A Comparison of Logistic Regression and Random Forest Models for Response Propensity Weighting and Stratification

Abstract

Response rates for modern surveys across many different modes are trending downward, leaving the potential for nonresponse biases in estimates derived from respondents alone. The reasons for nonresponse may be complex functions of known auxiliary variables or of unknown latent variables not measured by practitioners. The degree to which the propensity to respond is associated with survey outcomes casts light on the overall potential for nonresponse bias in estimates of means and totals. The most common methods for nonresponse adjustment to compensate for the potential bias in estimates have been logistic and probit regression models. However, for more complex nonresponse mechanisms that may be nonlinear or involve many interaction effects, these methods may fail to converge and thus fail to generate nonresponse adjustments for the sampling weights. In this paper we compare these traditional techniques to a relatively new data mining technique, random forests, under a simple and a complex nonresponse propensity population model, using both direct and propensity stratification nonresponse adjustments. Random forests appear to offer marginal improvements over logistic regression for the complex response model in direct propensity adjustment, but give some surprising results for propensity stratification across both response models.
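
The two adjustment styles named in the abstract can be illustrated with a minimal sketch. It assumes the response propensities have already been estimated by some model (logistic regression, a random forest, or anything else) and shows only how they are turned into adjusted weights. The function names, the five equal-sized classes, and the toy data are assumptions made for illustration; the abstract does not describe the paper's exact procedure.

```python
def direct_adjustment(base_weights, propensities, responded):
    """Direct adjustment: divide each respondent's base weight by its
    estimated response propensity. Nonrespondents get weight 0."""
    return [
        w / p if r else 0.0
        for w, p, r in zip(base_weights, propensities, responded)
    ]


def stratified_adjustment(base_weights, propensities, responded, n_classes=5):
    """Propensity stratification: sort units by estimated propensity, cut
    them into equal-sized classes, and inflate respondent weights in each
    class by (total sample weight) / (respondent weight)."""
    n = len(propensities)
    order = sorted(range(n), key=lambda i: propensities[i])
    adjusted = [0.0] * n
    for c in range(n_classes):
        members = order[c * n // n_classes:(c + 1) * n // n_classes]
        total = sum(base_weights[i] for i in members)
        resp = sum(base_weights[i] for i in members if responded[i])
        if resp == 0:
            # No respondents in this class: its weight is dropped here.
            # A real application would collapse it into a neighbouring class.
            continue
        factor = total / resp
        for i in members:
            if responded[i]:
                adjusted[i] = base_weights[i] * factor
    return adjusted


# Toy sample (hypothetical values): 10 units with base weight 10 each.
base_weights = [10.0] * 10
propensities = [0.2, 0.2, 0.4, 0.4, 0.5, 0.5, 0.8, 0.8, 0.9, 0.9]
responded = [True, False, True, False, True, False, True, True, True, True]

direct = direct_adjustment(base_weights, propensities, responded)
strat = stratified_adjustment(base_weights, propensities, responded)
```

In this sketch the stratified weights reproduce the full-sample weight total whenever every class contains at least one respondent. The direct weights depend on each individual propensity estimate, so they are more sensitive to extreme predicted values.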
