Conference: IEEE Globecom Workshops

Exploiting Text Data to Improve Critical Care Mortality Prediction

Abstract

There has been a significant increase in the quantity, quality, and availability of unstructured clinical notes, motivating numerous machine learning approaches that leverage such data to improve predictive capabilities in medical settings. However, it remains unanswered whether the properties of the patient group under observation influence the effectiveness of including unstructured data sources. Including unstructured clinical notes adds both an acquisition cost, such as a clinician recording the notes and the records being converted to an appropriate digital format, and a computational cost, such as the need for more complex and computationally expensive machine learning algorithms. Thus, it is important to understand the potential benefits offered by these unstructured data sources before attempting to use them. We empirically evaluate the performance impact of including unstructured clinical notes when performing mortality prediction by reproducing 29 previously published studies in this area. We use two common feature extraction methods, Word2Vec and Bag-of-Words, with two existing machine learning models, XGBoost and Logistic Regression. Our results show that the performance of these approaches differs significantly depending on the properties of the patient group under study. Additionally, we identify several key findings that can be used to predict, based on properties of the patient group, whether including data from unstructured clinical notes will be beneficial.
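As a rough illustration of what "including unstructured clinical notes" can mean in practice, the sketch below appends Bag-of-Words counts from a note to a structured feature vector and fits a logistic regression. It is a minimal sketch, not the authors' pipeline: the abstract does not say how features are combined or which implementations are used, so the hand-written gradient-descent trainer, the function names, the single "severity" feature, and the four toy notes are all assumptions invented for illustration.

```python
import math
from collections import Counter

def build_vocab(notes):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for note in notes:
        for tok in note.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def bag_of_words(note, vocab):
    """Return a count vector for one note; unknown tokens are ignored."""
    vec = [0.0] * len(vocab)
    for tok, n in Counter(note.lower().split()).items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(x, w, b):
    """Predicted probability of the positive class (here: mortality)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data, made up for this sketch (not from the paper).
structured = [[0.9], [0.8], [0.2], [0.1]]  # one assumed scaled severity score
notes = [
    "patient unresponsive intubated",
    "unresponsive on pressors",
    "patient alert and stable",
    "alert stable ambulating",
]
labels = [1, 1, 0, 0]

vocab = build_vocab(notes)
# Structured features followed by the note's Bag-of-Words counts.
X_both = [s + bag_of_words(n, vocab) for s, n in zip(structured, notes)]
w_b, b_b = train_logreg(X_both, labels)
```

Training the same function on `structured` alone gives a structured-only baseline, so the two models can be compared on held-out patients from a given patient group. In a real study, library implementations of Logistic Regression, XGBoost, and Word2Vec would replace this hand-written trainer.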
