Workshop on NLP and pseudonymisation

Augmenting a De-identification System for Swedish Clinical Text Using Open Resources and Deep Learning

Abstract

Electronic patient records are produced in abundance every day, and there is demand to use them for research or management purposes. However, the free text of these records contains information that can identify the patient, so tools are needed to detect this sensitive information. The aim is to compare two machine learning algorithms, Long Short-Term Memory (LSTM) and Conditional Random Fields (CRF), applied to a Swedish clinical data set annotated for de-identification. The results show that CRF performs better than deep learning with LSTM: CRF gives the best results, an F_1 score of 0.91, when more data from within the same domain is added. Adding general open data, on the other hand, did not improve the results.
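
For readers unfamiliar with the metric, the F_1 score reported above is conventionally the harmonic mean of precision P and recall R. The standard definition is given below in terms of true positives (TP), false positives (FP), and false negatives (FN). The abstract does not say whether the score is computed per token or per entity, or how it is averaged across classes, so this is only the general formula and not a description of the authors' exact evaluation setup.

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}
\]
```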
