Journal: BMC Bioinformatics

Dynamic model updating (DMU) approach for statistical learning model building with missing data

Abstract

Developing statistical and machine learning methods for studies with missing information is a ubiquitous challenge in real-world biological research. Existing strategies either remove the samples with missing values, as in complete case analysis (CCA), or impute the missing values, for example with predictive mean matching (PMM) as implemented in MICE. These strategies have limitations: removal loses information, and imputed values may not be close to the true, unobserved values. Further, when medical data arrive piecemeal, these strategies must wait for data collection to finish before a complete dataset is available for statistical modeling. This study proposes dynamic model updating (DMU), a different strategy for developing statistical models with missing data. DMU uses only the information available in the dataset. It segments the original dataset into small complete datasets using hierarchical clustering, then fits a Bayesian regression on each small complete dataset. Predictor estimates are updated using the posterior estimates from each dataset. DMU is evaluated on both simulated data and real studies, where it performs better than or on par with other approaches such as CCA and PMM. DMU provides an alternative to the existing approaches of information elimination and imputation for processing datasets with missing values. While the study applied the approach to continuous cross-sectional data, it can also be applied to longitudinal, categorical and time-to-event biological data.
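
The segment-then-update idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that do not come from the source: rows are grouped by exact missingness pattern instead of hierarchical clustering, the regression is a conjugate Gaussian linear model with a known noise variance (`noise_var`) and a zero-mean prior (`prior_var`), coefficients are shared across blocks, and a predictor missing from a block is simply left out of that block's likelihood. The function names `split_by_missing_pattern` and `dmu_fit` are hypothetical.

```python
import numpy as np


def split_by_missing_pattern(X, y):
    """Group rows that share the same set of observed predictors.

    Returns a list of (observed column indices, X block, y block); each
    block is a small dataset with no missing values.
    Assumption: exact pattern matching stands in for the hierarchical
    clustering step described in the abstract.
    """
    observed = ~np.isnan(X)
    groups = {}
    for i, row in enumerate(observed):
        groups.setdefault(tuple(np.flatnonzero(row)), []).append(i)
    blocks = []
    for cols, rows in groups.items():
        if cols:  # skip rows in which no predictor was observed
            cols = np.array(cols)
            blocks.append((cols, X[np.ix_(rows, cols)], y[rows]))
    return blocks


def dmu_fit(X, y, noise_var=1.0, prior_var=100.0):
    """Update a Gaussian posterior over coefficients block by block.

    The posterior is kept in information form (precision matrix and
    precision-weighted mean), so each complete block only adds its own
    sufficient statistics for the predictors it observed.
    """
    p = X.shape[1]
    precision = np.eye(p) / prior_var  # prior precision, zero prior mean
    shift = np.zeros(p)                # precision @ mean
    for cols, Xb, yb in split_by_missing_pattern(X, y):
        precision[np.ix_(cols, cols)] += Xb.T @ Xb / noise_var
        shift[cols] += Xb.T @ yb / noise_var
    cov = np.linalg.inv(precision)
    return cov @ shift, cov


# Simulated example: three independent predictors, two of them partly missing.
rng = np.random.default_rng(0)
true_beta = np.array([1.5, -1.0, 0.5])
X_full = rng.normal(size=(600, 3))
y = X_full @ true_beta + rng.normal(scale=0.5, size=600)

X_missing = X_full.copy()
X_missing[200:400, 2] = np.nan  # second block lacks predictor 2
X_missing[400:, 1] = np.nan     # third block lacks predictor 1

beta_hat, cov = dmu_fit(X_missing, y)
print("posterior mean:", np.round(beta_hat, 2))
```

Because each block's contribution is additive in information form, the order in which blocks arrive does not change the final posterior in this sketch, which fits the piecemeal data collection scenario the abstract mentions. Leaving a missing predictor out of a block is only reasonable here because the simulated predictors are independent; with correlated predictors it would bias the shared coefficients.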
