Frontiers in Public Health

Can the Use of Bayesian Analysis Methods Correct for Incompleteness in Electronic Health Records Diagnosis Data? Development of a Novel Method Using Simulated and Real-Life Clinical Data


Abstract

Background Patient health information is collected routinely in electronic health records (EHRs) and used for research purposes; however, many health conditions are known to be under-diagnosed or under-recorded in EHRs. In research, missing diagnoses result in under-ascertainment of true cases, which attenuates estimated associations between variables and results in a bias towards the null. Bayesian approaches allow prior information, such as the likely rates of missingness in the data, to be specified in the model. This paper describes a Bayesian analysis approach which aimed to reduce attenuation of associations in EHR studies focussed on conditions characterised by under-diagnosis.
Methods Study 1: We created synthetic data designed to mimic EHR data in which diagnoses were under-recorded. We fitted logistic regression (LR) models with and without Bayesian priors representing rates of misclassification in the data, and compared the LR parameters estimated by the two sets of models. Study 2: We used EHR data from UK primary care in a case-control design with dementia as the outcome. We fitted LR models examining risk factors for dementia, with and without generic prior information on misclassification rates. We compared the LR parameters estimated by models with and without the priors, and estimated classification accuracy using the Area Under the Receiver Operating Characteristic curve.
Results Study 1: In synthetic data, estimates of LR parameters were much closer to the true parameter values when Bayesian priors were added to the model; with no priors, parameters were substantially attenuated by under-diagnosis. Study 2: The Bayesian approach ran well on real-life clinical data from UK primary care, and adding prior information increased LR parameter values in all cases. In multivariate regression models, Bayesian methods showed no improvement in classification accuracy over traditional LR.
Conclusions The Bayesian approach showed promise but had implementation challenges in real clinical data: prior information on rates of misclassification was difficult to find. Our approach made a number of assumptions, such as diagnoses being missing at random. Further development is needed to integrate the method into studies using real-life EHR data. Our findings nevertheless highlight the importance of developing methods to address missingness in EHR data.
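
The attenuation described for Study 1 can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a single continuous risk factor, perfect specificity (no false-positive diagnoses), and a recording sensitivity that is known exactly, which stands in for an informative prior as a fixed point value. All names and numbers (`simulate`, `fit`, a sensitivity of 0.6, true coefficients of 0 and 1) are hypothetical and chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, b0, b1, sensitivity, seed=0):
    # True disease status follows a logistic model; each true case is
    # recorded in the "EHR" only with probability `sensitivity`.
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        true_case = rng.random() < sigmoid(b0 + b1 * x)
        recorded = true_case and rng.random() < sensitivity
        xs.append(x)
        ys.append(1 if recorded else 0)
    return xs, ys


def fit(xs, ys, sensitivity=1.0, iters=25):
    # Fisher scoring for the model P(recorded = 1 | x) = sensitivity * sigmoid(b0 + b1*x).
    # With sensitivity = 1.0 this reduces to ordinary logistic regression.
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, y in zip(xs, ys):
            s = sigmoid(b0 + b1 * x)
            p = sensitivity * s
            q = max(1.0 - p, 1e-12)  # guard against division by zero
            score = (y - p) * (1.0 - s) / q               # d loglik / d eta
            weight = sensitivity * s * (1.0 - s) ** 2 / q  # expected information
            g0 += score
            g1 += score * x
            i00 += weight
            i01 += weight * x
            i11 += weight * x * x
        det = i00 * i11 - i01 * i01
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1


# Assumed scenario: true intercept 0, true slope 1, 60% of true cases recorded.
xs, ys = simulate(n=20000, b0=0.0, b1=1.0, sensitivity=0.6)
naive = fit(xs, ys)                       # ignores under-recording
adjusted = fit(xs, ys, sensitivity=0.6)   # uses the assumed recording rate
print("naive slope:   ", round(naive[1], 3))
print("adjusted slope:", round(adjusted[1], 3))
```

In this sketch the naive slope is pulled towards zero while the adjusted slope sits closer to the simulated value of 1. A fully Bayesian version would place a prior distribution on the sensitivity rather than fixing it, and would then be exposed to the difficulty the authors report of finding reliable prior information.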
