Journal: International Journal of Population Data Science

Development and validation of data quality rules in administrative health data using association rule mining


Abstract

Introduction

Data quality assessment is a challenging facet of research using coded administrative health data. Our previous study demonstrated the potential of association rule mining to assess data quality. The objective of this study is to develop and validate a set of coding association rules for data quality assessment.

Objectives and Approach

We used the Canadian reabstracted hospital discharge abstract data (DAD), with clinical diagnoses coded in International Classification of Diseases, 10th Revision, Canada (ICD-10-CA) codes, for rule development. The DAD data were divided into 5 age groups. Association rule mining was conducted on the reabstracted DAD in each age group to extract ICD-10 coding association rules at the three- and four-digit levels. Rule strength was assessed using support and confidence. The rules will be reviewed by a panel of 5 physicians and 2 coding specialists, who will assess their appropriateness from clinical and coding perspectives using a modified Delphi rating.

Results

In total, 975 rules at the three-digit level and 822 rules at the four-digit level were learned from the data. Half of the rules were in the age group of ≥65, and no rules were found in the age group of 5 to 19. The interquartile range of rule confidence was 0.112 to 0.425 at the three-digit level and 0.073 to 0.222 at the four-digit level. Two-thirds of the rules had diagnosis codes related to endocrine and metabolic disorders and diseases of the circulatory, respiratory and genitourinary systems. The panel review will be conducted in early April, and the final set of rules will be available before the conference.

Conclusion/Implications

This study developed a set of validated ICD-10 coding association rules, creating a useful tool for cost-effective assessment of data quality in routinely collected administrative health data.
