Influence of Pattern of Missing Data on Performance of Imputation Methods: An Example Using National Data on Drug Injection in Prisons

Abstract

Background: Policy makers need models to be able to detect groups at high risk of HIV infection. Incomplete records and dirty data are frequently seen in national data sets. The presence of missing data challenges the practice of model development. Several studies have suggested that the performance of imputation methods is acceptable when the missing rate is moderate. One issue that has received less attention, and is addressed here, is the role of the pattern of missing data.

Methods: We used information on 2720 prisoners. Results derived from fitting a regression model to the whole data set served as the gold standard. Missing data were then generated so that 10%, 20% and 50% of data were lost. In scenario 1, we generated missing values, at the above rates, in one variable that was significant in the gold-standard model (age). In scenario 2, a small proportion of each independent variable was dropped. Four imputation methods, under different Events Per Variable (EPV) values, were compared in terms of selection of important variables and parameter estimation.

Results: In scenario 2, bias in the estimates was low and the performance of all methods for handling missing data was similar. All methods at all missing rates were able to detect the significance of age. In scenario 1, bias in the estimates increased, in particular at the 50% missing rate. Here, at EPVs of 10 and 5, imputation methods failed to capture the effect of age.

Conclusion: In scenario 2, all imputation methods, at all missing rates, were able to detect age as significant. This was not the case in scenario 1. Our results show that the performance of imputation methods depends on the pattern of missing data.
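The two missing-data patterns described in the Methods can be illustrated with a short Python sketch. It is not the authors' code. The data are synthetic, the function names are hypothetical, and the missingness is completely at random. The sketch also assumes that the missing rate is the fraction of records with a missing value, and that in scenario 2 each incomplete record loses exactly one randomly chosen variable; the abstract does not specify either detail. EPV is computed under its usual definition, the number of outcome events divided by the number of candidate predictors.

```python
import numpy as np

def mask_single_variable(X, col, rate, rng):
    """Scenario 1 style: blank out a fraction `rate` of ONE column."""
    X = X.astype(float).copy()
    n = X.shape[0]
    k = int(round(rate * n))                      # number of incomplete records
    rows = rng.choice(n, size=k, replace=False)   # records that lose the value
    X[rows, col] = np.nan
    return X

def mask_spread(X, rate, rng):
    """Scenario 2 style: same number of incomplete records, but each one
    loses a single value in a randomly chosen column (an assumption)."""
    X = X.astype(float).copy()
    n, p = X.shape
    k = int(round(rate * n))
    rows = rng.choice(n, size=k, replace=False)
    cols = rng.integers(0, p, size=k)             # one column per chosen record
    X[rows, cols] = np.nan
    return X

def events_per_variable(y, n_predictors):
    """EPV = number of outcome events / number of candidate predictors."""
    return int(np.sum(y)) / n_predictors

rng = np.random.default_rng(0)
X = rng.normal(size=(2720, 5))        # synthetic stand-in, not the prison data
y = rng.integers(0, 2, size=2720)     # synthetic binary outcome

s1 = mask_single_variable(X, col=0, rate=0.5, rng=rng)
s2 = mask_spread(X, rate=0.5, rng=rng)

print("scenario 1, missing per column:", np.isnan(s1).sum(axis=0))
print("scenario 2, missing per column:", np.isnan(s2).sum(axis=0))
print("EPV:", events_per_variable(y, X.shape[1]))
```

Both masked arrays contain the same number of incomplete records, but in the first all of the loss falls on one predictor, while in the second it is spread thinly over every predictor. This is the contrast the abstract identifies as driving the difference in imputation performance.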

Bibliographic details

Journal: International Journal of Health Policy and Management
Authors: Saiedeh Haji-Maghsoudi; Ali-akbar Haghdoost; Azam Rastegari; Mohammad Reza Baneshi
Year (volume), issue: 2013 (1), 1
Year: 2013
Pages: 69–77
Total pages: 9
Original format: PDF
Chinese Library Classification: Pathology
Keywords: Missing Data; MICE; Expectation Maximum Algorithm; Drug Injection; National Data

Similar literature

1. Long-term trends of wet inorganic nitrogen deposition in Rocky Mountain National Park: Influence of missing data imputation methods and associated uncertainty [J]. Schichtel Bret A., Gebhart Kristi A., Morris Kristi H., The Science of the Total Environment. 2019, issue OCTa15
2. Performance of standard imputation methods for missing quality of life data as covariate in survival analysis based on simulations from the International Breast Cancer Study Group Trials Ⅵ and Ⅶ [J]. Communications in Statistics. 2019, issue 8a10
3. Application of the Modified Imputation Method to Missing Data to Increase Classification Performance [C]. Elenita T. Capariño, Ariel M. Sison, Ruji P. Medina, IEEE International Conference on Computer and Communication Systems. 2019
4. Extension of the Regression Method for Imputation of Data with Monotone Missing Pattern using Multivariate Adaptive Regression Splines (MARS), with Applications to Systematic-Missing-At-Random (SMAR) Study Designs [D]. Lu, Feng. 2013
5. Methylation data imputation performances under different representations and missingness patterns [O]. Pietro Di Lena, Claudia Sala, Andrea Prodi, 2020
6. Influence of Pattern of Missing Data on Performance of Imputation Methods: An Example from National Data on Drug Injection in Prisons [O]. Haji-Maghsoudi Saiedeh, Haghdoost Ali-Akbar, Rastegari Azam, 2013
