Conference: International Conference on Enterprise Information Systems

Using Data Mining for Prediction of Hospital Length of Stay: An Application of the CRISP-DM Methodology

Abstract

Hospitals now collect vast amounts of data related to patient records. These data hold valuable knowledge that can be used to improve hospital decision making, and data mining techniques aim precisely at extracting useful knowledge from raw data. This work describes the implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data on inpatient hospitalization, covering 2000 to 2013, were collected from a Portuguese hospital. The goal was to predict generic hospital Length of Stay from indicators that are commonly available during the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained by the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened up with a sensitivity analysis procedure that revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized and the associated medical specialty. This extracted knowledge confirmed that the obtained predictive model is credible and of potential value for supporting the decisions of hospital managers.
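The coefficient of determination used above to rank the models can be illustrated with a minimal sketch. It compares only the two simplest methods named in the abstract, an average predictor and a one-input least-squares regression, on synthetic data. Everything here is an assumption made for illustration: the data are randomly generated (not the hospital records), the single `age` input stands in for the 14 inputs, and the helper names `r_squared` and `fit_simple_regression` are hypothetical. The Random Forest and the other methods from the paper are not reproduced.

```python
import random
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_simple_regression(x, y):
    """Ordinary least squares for a single input: y is approximated by a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Synthetic stand-in data (assumption): length of stay grows with age plus noise.
rng = random.Random(0)
age = [rng.uniform(18, 90) for _ in range(200)]
los = [2.0 + 0.1 * a + rng.gauss(0, 1.5) for a in age]

# Simple holdout split: first 150 rows for fitting, last 50 for evaluation.
train_age, test_age = age[:150], age[150:]
train_los, test_los = los[:150], los[150:]

# Average Prediction baseline: always predict the training mean.
baseline_pred = [mean(train_los)] * len(test_los)

# Regression on the single synthetic input.
a, b = fit_simple_regression(train_age, train_los)
regression_pred = [a + b * x for x in test_age]

r2_baseline = r_squared(test_los, baseline_pred)
r2_regression = r_squared(test_los, regression_pred)
print(f"R^2 average predictor: {r2_baseline:.3f}")
print(f"R^2 simple regression: {r2_regression:.3f}")
```

A constant predictor can never score above zero on this metric, which is why an average predictor serves as a natural floor when comparing learning methods by R².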
