Source: AMIA Annual Symposium Proceedings

Ensemble-based Methods to Improve De-identification of Electronic Health Record Narratives



Abstract

Text de-identification is an application of clinical natural language processing that offers significant efficiency and scalability advantages. Hence, various learning algorithms have been applied to this task to yield better performance. Instead of choosing the best individual learning algorithm, we aim to improve de-identification by constructing ensembles that lead to more accurate classification. We present three different ensemble methods that combine multiple de-identification models trained from deep learning, shallow learning, and rule-based approaches. Each model is capable of automated de-identification without manual medical expertise. Our experimental results show that the stacked learning ensemble is more effective than other ensemble methods, producing the highest recall, the most important metric for de-identification. The stacked ensemble achieved state-of-the-art performance on the 2014 i2b2 dataset with 97.04% precision, 94.45% recall, and 95.73% F1 score.
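The difference between a simple vote and a stacked ensemble can be illustrated with a minimal sketch. The abstract names stacking but does not describe the other two ensemble methods or the meta-learner, so everything below is an assumption made for illustration: the `PHI`/`O` token labels, the toy held-out data, the majority-vote baseline, and the lookup-table meta-learner are all hypothetical and are not the authors' implementation. The last line only checks that the reported F1 score is the harmonic mean of the reported precision and recall.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the label that most base models predicted for one token."""
    return Counter(predictions).most_common(1)[0][0]


def train_stacker(base_outputs, gold_labels):
    """Fit a toy meta-learner (hypothetical, not the paper's).

    Each tuple of base-model predictions is mapped to the gold label
    that most often accompanied it on held-out data.
    """
    counts = defaultdict(Counter)
    for preds, gold in zip(base_outputs, gold_labels):
        counts[tuple(preds)][gold] += 1
    return {combo: c.most_common(1)[0][0] for combo, c in counts.items()}


def stacked_predict(stacker, predictions):
    """Use the learned table; fall back to voting for unseen combinations."""
    return stacker.get(tuple(predictions), majority_vote(predictions))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical held-out tokens: predictions from a deep, a shallow, and a
# rule-based model (in that order), paired with the gold label.
held_out = [
    (("PHI", "O", "O"), "PHI"),
    (("PHI", "O", "O"), "PHI"),
    (("O", "PHI", "O"), "O"),
    (("O", "O", "PHI"), "O"),
    (("PHI", "PHI", "PHI"), "PHI"),
    (("O", "O", "O"), "O"),
]
stacker = train_stacker([p for p, _ in held_out], [g for _, g in held_out])

token = ("PHI", "O", "O")
print(majority_vote(token))             # O   (two of three models say O)
print(stacked_predict(stacker, token))  # PHI (meta-learner trusts model 1 here)
print(round(100 * f1_score(0.9704, 0.9445), 2))  # 95.73
```

In this toy data the vote misses a PHI token that only the first model flags, while the meta-learner has learned from held-out data that this pattern usually marks PHI. That is one way a stacked ensemble could raise recall, though the paper's actual mechanism may differ.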
