Model-assisted calibration of non-probability sample survey data using adaptive LASSO

Chen Jack Kuang Tsung; Valliant Richard L.; Elliott Michael R.


Abstract

The probability-sampling-based framework has dominated survey research because it provides precise mathematical tools to assess sampling variability. However, increasing costs and declining response rates are expanding the use of non-probability samples, particularly in general population settings, where samples of individuals pulled from web surveys are becoming increasingly cheap and easy to access. But non-probability samples are at risk for selection bias due to differential access, degrees of interest, and other factors. Calibration to known statistical totals in the population provides a means of potentially diminishing the effect of selection bias in non-probability samples. Here we show that model calibration using adaptive LASSO can yield a consistent estimator of a population total as long as a subset of the true predictors is included in the prediction model, thus allowing large numbers of possible covariates to be included without risk of overfitting. We show that model calibration using adaptive LASSO provides improved estimation with respect to mean square error relative to standard competitors such as generalized regression (GREG) estimators when a large number of covariates are required to determine the true model, with effectively no loss in efficiency over GREG when smaller models will suffice. We also derive closed-form variance estimators of population totals, and compare their behavior with bootstrap estimators. We conclude with a real-world example using data from the National Health Interview Survey.
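
For orientation, the two ingredients named in the abstract can be written out in their standard textbook forms. This is a sketch under assumptions, not the paper's own notation: the linear working model, the chi-square distance, the starting weights d_i (for a non-probability sample these might simply be N/n), and all symbols are illustrative choices that the abstract does not state.

```latex
% Adaptive LASSO (standard form, assumed linear working model):
% penalized least squares with data-dependent penalty weights
% built from an initial estimate \tilde\beta (e.g. least squares).
\hat{\beta} = \arg\min_{\beta} \sum_{i \in s} \left( y_i - x_i^{\top}\beta \right)^2
  + \lambda_n \sum_{j=1}^{p} \frac{|\beta_j|}{|\tilde{\beta}_j|^{\gamma}},
  \qquad \gamma > 0 .

% Model calibration (standard form, assumed chi-square distance):
% choose weights w_i close to the starting weights d_i that reproduce
% the population size N and the population total of the predictions.
\min_{w} \sum_{i \in s} \frac{(w_i - d_i)^2}{d_i}
\quad \text{subject to} \quad
\sum_{i \in s} w_i = N, \qquad
\sum_{i \in s} w_i \hat{y}_i = \sum_{i \in U} \hat{y}_i ,
\qquad \hat{y}_i = x_i^{\top}\hat{\beta} .

% Resulting calibrated estimator of the population total of y.
\hat{T}_y = \sum_{i \in s} w_i y_i .
```

Here s denotes the sample and U the population of size N. Because the calibration constraints involve only the fitted values, any covariate whose coefficient the adaptive LASSO shrinks to zero drops out of the constraints, which is one way to read the abstract's claim that many candidate covariates can be offered to the model without overfitting.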


Bibliographic record

Source
Survey Methodology | 2018, Issue 1 | pp. 117-144 | 28 pages
Authors
Chen Jack Kuang Tsung; Valliant Richard L.; Elliott Michael R.;
Author affiliations

Survey Monkey Inc, Palo Alto, CA 94301 USA;

Univ Michigan, Inst Social Res, Survey Res Ctr, Ann Arbor, MI USA;

Univ Michigan, Sch Publ Hlth, Survey Res Ctr, Inst Social Res, Ann Arbor, MI 48109 USA|Univ Michigan, Sch Publ Hlth, Dept Biostat, Ann Arbor, MI 48109 USA;

Original format: PDF
Language: English
Keywords
Adaptive LASSO estimators; Generalized regression estimator; Non-representative sample; Over-fitting; Variable selection; Oracle property;

Date added to database: 2022-08-18 04:16:36

Similar literature

1. Model-assisted estimation of change in forest biomass over an 11 year period in a sample survey supported by airborne LiDAR: A case study with post-stratification to provide "activity data" [J]. Næsset E., Bollandsås O.M., Gobakken T. Remote Sensing of Environment: An Interdisciplinary Journal. 2013
2. Adapting by calibration the sample size of a phase III trial on the basis of phase II data [J]. Daniele De Martini. Pharmaceutical Statistics. 2011, Issue 2
3. A Model-calibration Approach to Using Complete Auxiliary Information from Stratified Sampling Survey Data [J]. WU, Chang-chun, ZHANG, 数学季刊：英文版 (Chinese Quarterly Journal of Mathematics). 2006, Issue 2
4. Adaptive Digital Calibration of Over-Sampled Data Converter Systems [C]. Thomas Holm Hansen, Lars Risbo. Audio Engineering Society convention. 2003
5. Using LASSO to Calibrate Non-probability Samples using Probability Samples [D]. Chen, Kuang Tsung. 2016
6. Weighting Non-probability and Probability Sample Surveys in Describing Cancer Catchment Areas [O]. Ronaldo Iachan, Lewis Berman, Tonja M. Kyle
7. Model-assisted estimation of change in forest biomass over an 11 year period in a sample survey supported by airborne LiDAR: A case study with post-stratification to provide "activity data" [O]. Næsset, Erik, Bollandsås, Ole Martin, Gobakken, Terje. 2013
8. Measuring Intent to Participate and Participation in the 2010 Census and Their Correlates and Trends: Comparisons of RDD Telephone and Non-Probability Sample Internet Survey Data. Study Series (Survey Methodology No. 2010-15) [R]. Pasek, J., Krosnick, J. A. 2010
