International Conference on Research Challenges in Information Science

Increasing secondary diagnosis encoding quality using data mining techniques

Abstract

To measure medical activity, hospitals are required to manually encode information about each inpatient episode using the International Classification of Diseases (ICD-10). This task is time-consuming and requires substantial staff training. We propose to speed up and simplify the tedious task of coding patient information, especially for secondary diagnoses that are not well described in medical resources such as discharge letters and medical records. Our approach uses data mining techniques to explore medical databases of previously encoded secondary diagnoses and uses the stored structured information (age, gender, diagnoses count, medical procedures...) to build a decision tree that either assigns the proper secondary diagnosis code to the corresponding inpatient episode or flags inpatient episodes that contain implausible secondary diagnoses. The results suggest that better performance can be achieved by using a low level of diagnosis granularity and by adding filters that balance the distribution of negative and positive examples in the training set. The evaluation scores vary widely across the studied diagnoses: the highest F1 score is 75% and the lowest is 25%, which indicates that further enhancements are needed to achieve good performance regardless of the encoded diagnosis. However, the average accuracy across all studied secondary diagnoses is around 80%, which indicates better performance on negative predictions; the approach could therefore be useful for preventing or detecting wrong coding assignments of secondary diagnoses in the inpatient stay.
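The gap between accuracy and F1 reported in the abstract, and the idea of filtering the training set to balance negative and positive examples, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the confusion counts, the toy episodes, and the helper names `accuracy`, `f1_score` and `undersample_negatives` are hypothetical and do not come from the paper, and random undersampling is only one possible balancing filter, not necessarily the one the authors used.

```python
import random


def accuracy(tp, fp, fn, tn):
    """Fraction of all episodes classified correctly (positives and negatives)."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; true negatives play no role."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def undersample_negatives(examples, ratio=1.0, seed=0):
    """Keep every positive and at most ratio * n_positive random negatives.

    `examples` is a list of (features, label) pairs, where label 1 means the
    secondary diagnosis was coded for the episode and 0 means it was not.
    This is an assumed balancing filter, not the paper's documented one.
    """
    positives = [e for e in examples if e[1] == 1]
    negatives = [e for e in examples if e[1] == 0]
    keep = min(len(negatives), int(ratio * len(positives)))
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return positives + rng.sample(negatives, keep)


# Hypothetical confusion counts for one diagnosis code (not from the paper).
tp, fp, fn, tn = 10, 20, 40, 230
acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.2f} f1={f1:.2f}")

# Hypothetical imbalanced training set: 5 positive and 95 negative episodes.
episodes = [({"age": 60 + i}, 1) for i in range(5)]
episodes += [({"age": 30 + i}, 0) for i in range(95)]
balanced = undersample_negatives(episodes, ratio=1.0)
print(len(balanced))
```

With these made-up counts, most episodes are true negatives, so accuracy stays high while F1, which ignores true negatives, stays low. That pattern is consistent with the abstract's remark that the classifier is better at negative predictions than at assigning the positive code.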
