
Imputation Techniques Analysis for Incomplete Medical Datasets in Case-Based Reasoning System



Abstract

Expert systems built on case-based reasoning (CBR) are widely accepted for healthcare-related applications. CBR is analogous to how humans solve problems using prior experience. CBR is a lazy learner: it keeps cases in a case base and uses them to solve each target case it encounters. The accuracy of any CBR system therefore depends on the quality of the cases in its case base. Ample electronic healthcare data is now available, but it comes with a growing amount of missing data. Multiple techniques exist for handling missing values. One widely used approach is imputation, which computes the missing values of an attribute from the values actually observed in the dataset. This study was conducted to find the imputation technique best suited to calculating missing values when constructing the case base of a CBR system. It uses a benchmark medical dataset to examine four combinations of imputation techniques and data preprocessing methods. The author proposes a hybrid model that uses Pearson correlation to find the critical feature and then applies an imputation technique to fill in the missing values of the different attributes. Experimental results show that the combination of Pearson correlation and linear regression (LR) outperforms the other three imputation techniques and gives more accurate imputed values. To justify combining data preprocessing methods with imputation techniques, the findings are applied to a medical dataset collected from UCI.
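The hybrid idea described above (rank features by Pearson correlation, then impute with linear regression) can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not give the exact procedure, so the choice of a single best predictor, the helper names (`pearson`, `fit_line`, `impute_column`), and the toy records below are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def impute_column(rows, target):
    """Fill None entries of `target` from the most correlated complete column.

    Assumption: only one predictor is used, chosen by largest |r|.
    Returns the chosen predictor name and a filled copy of the rows.
    """
    complete = [r for r in rows if r[target] is not None]
    ys = [r[target] for r in complete]
    # Candidate predictors: columns with no missing values at all.
    candidates = [c for c in rows[0]
                  if c != target and all(r[c] is not None for r in rows)]
    best = max(candidates,
               key=lambda c: abs(pearson([r[c] for r in complete], ys)))
    slope, intercept = fit_line([r[best] for r in complete], ys)
    filled = [dict(r) for r in rows]  # leave the input untouched
    for r in filled:
        if r[target] is None:
            r[target] = slope * r[best] + intercept
    return best, filled


# Made-up records for illustration only; not taken from any real dataset.
rows = [
    {"age": 30, "bmi": 22.0, "glucose": 90.0},
    {"age": 40, "bmi": 27.0, "glucose": 100.0},
    {"age": 50, "bmi": 24.0, "glucose": 110.0},
    {"age": 60, "bmi": 29.0, "glucose": None},
    {"age": 70, "bmi": 23.0, "glucose": 130.0},
]

best, filled = impute_column(rows, "glucose")
print(best, filled[3]["glucose"])  # age 120.0
```

In this toy data `glucose` is exactly `age + 60` on the complete rows, so `age` is selected as the predictor and the missing entry is filled with 120.0. On real medical data the relationship would be noisy, and imputation accuracy would need to be evaluated against held-out known values, as the study does when comparing techniques.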
