Source journal: Computational Statistics & Data Analysis

Multiply imputing missing values in data sets with mixed measurement scales using a sequence of generalised linear models



Abstract

Multiple imputation is a commonly used approach to deal with missing values. In this approach, an imputer repeatedly imputes the missing values by taking draws from the posterior predictive distribution for the missing values conditional on the observed values, and releases these completed data sets to analysts. With each completed data set the analyst performs the analysis of interest, treating the data as if it were fully observed. These analyses are then combined with standard combining rules, allowing the analyst to make appropriate inferences which take into account the uncertainty present due to the missing data. In order to preserve the statistical properties present in the data, the imputer must use a plausible distribution to generate the imputed values. In data sets containing variables with different measurement scales, e.g. some categorical and some continuous variables, this is a challenging problem. A method is proposed to multiply impute missing values in such data sets by modelling the joint distribution of the variables in the data through a sequence of generalised linear models, and data augmentation methods are used to draw imputations from a proper posterior distribution using Markov Chain Monte Carlo (MCMC). The performance of the proposed method is illustrated using simulation studies and on a data set taken from a breast feeding study. (c) 2015 Published by Elsevier B.V.
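
The combining step mentioned in the abstract can be illustrated with a small sketch. The abstract only says "standard combining rules"; the code below assumes these are Rubin's rules for a scalar estimand, which is the usual meaning but is not stated in the source. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the paper.

```python
from math import inf
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data analyses of one scalar quantity.

    Assumes Rubin's combining rules (an assumption; the abstract does
    not name the rules). `estimates` holds the m point estimates and
    `variances` the m squared standard errors.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two analyses of equal count")

    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate

    # Degrees of freedom for the t reference distribution; infinite
    # when the imputations agree exactly (b == 0).
    if b == 0:
        df = inf
    else:
        r = (1 + 1 / m) * b / u_bar  # relative increase in variance
        df = (m - 1) * (1 + 1 / r) ** 2

    return {"estimate": q_bar, "variance": total, "df": df}


# Hypothetical results from m = 3 completed data sets.
pooled = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to the missing data: if it were dropped, the pooled variance would be the same as if the imputed values had been observed.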
