Journal: International Journal of Population Data Science

Impact of linkage quality on inferences drawn from analyses using imperfectly matched data with high rates of linkage errors


Abstract

Introduction: Studies based on high-quality linked data in developed countries show that residual linkage errors affect the bias and precision of subsequent analyses. Since 2015, we have conducted point-of-contact interactive record linkage (PIRL) between serological survey data and manually digitised medical records with low data quality from three clinics in rural Tanzania.

Objectives and Approach: We sought to determine the impact of the substantial linkage errors made by automated probabilistic linkage (a commonly used, less accurate, but much cheaper alternative to PIRL) on the bias and precision of inferences drawn from Cox regression analyses. The analyses compared time from a positive HIV diagnostic test to registration at a local HIV care and treatment clinic (CTC) by testing modality (sero-survey vs. clinic). Using PIRL links as the gold standard, we quantified false and missed matches, compared characteristics between linked and unlinked data, and evaluated regression estimates at low, medium, and high (25th, 50th, and 75th percentile) match score thresholds.

Results: Between 2015 and 2017, 297 and 147 individuals with gold standard links received HIV+ test results in sero-surveys and clinics, respectively. Automated probabilistic linkage correctly identified 276 individuals (positive predictive value [PPV] = 62%) at the low threshold and 43 individuals (PPV = 96%) at the high threshold. At the lowest threshold, false matches were more likely to be clinic testers and less likely to register at CTC. These differences attenuated as the threshold increased. Testing modality was significantly associated with time to CTC registration in the gold standard data (adjusted hazard ratio [HR] 6.42, 95% CI 4.45-9.28). Increasing false matches progressively weakened the association (low threshold: HR 4.99, 95% CI 3.45-7.21). Increases in missed matches were strongly correlated with a reduction in the precision of coefficient estimates (R-squared = 0.94; p = 0.0001).

Conclusion/Implications: While the significance of inferences did not change, a clear direction of bias was identified. High rates of false matches in this setting reduced the magnitude of the association; missed matches reduced precision. Adjusting for these biases could provide more robust results using data with considerable linkage errors.
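To make the quantities in the abstract concrete, the following is a minimal sketch of how false matches, missed matches, and PPV might be tallied at a given match score threshold when a gold standard set of links is available. It is not the authors' code: the function name `linkage_quality`, the record identifiers, and the match scores are all hypothetical toy values chosen for illustration, and the sketch assumes a link is accepted when its score is at or above the threshold.

```python
def linkage_quality(candidate_links, gold_links, threshold):
    """Summarise linkage errors at one match score threshold.

    candidate_links: dict mapping (survey_id, clinic_id) -> match score
                     (hypothetical output of an automated probabilistic linker)
    gold_links:      set of (survey_id, clinic_id) pairs treated as true links
                     (playing the role of the PIRL gold standard)
    """
    # Assumption: a candidate link is accepted if its score meets the threshold.
    accepted = {pair for pair, score in candidate_links.items() if score >= threshold}

    true_matches = accepted & gold_links    # accepted and correct
    false_matches = accepted - gold_links   # accepted but not in the gold standard
    missed_matches = gold_links - accepted  # gold standard links not recovered

    ppv = len(true_matches) / len(accepted) if accepted else float("nan")
    sensitivity = len(true_matches) / len(gold_links) if gold_links else float("nan")
    return {
        "true": len(true_matches),
        "false": len(false_matches),
        "missed": len(missed_matches),
        "ppv": ppv,
        "sensitivity": sensitivity,
    }


# Toy data only; none of these identifiers or scores come from the study.
gold = {("s1", "c1"), ("s2", "c2"), ("s3", "c3"), ("s4", "c4")}
candidates = {
    ("s1", "c1"): 12.0,
    ("s2", "c2"): 9.5,
    ("s3", "c9"): 6.0,  # a false match
    ("s3", "c3"): 4.0,
    ("s5", "c7"): 3.5,  # a false match
}

low = linkage_quality(candidates, gold, threshold=3.0)
high = linkage_quality(candidates, gold, threshold=9.0)
print(low)
print(high)
```

In this toy example, raising the threshold removes the false matches (PPV rises) but also drops true links (more missed matches), which mirrors the trade-off the abstract reports between the low and high thresholds.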