Redaction of Protected Health Information in EHRs using CRFs and Bi-directional LSTMs

Abstract

This paper describes the de-identification of personally identifiable information (PII) in electronic health records (EHRs) using two models: conditional random fields (CRFs) and bidirectional long short-term memory networks (LSTMs). Most medical records store private information such as PATIENT NAME, HOSPITAL NAME, and LOCATION that must be de-identified or redacted before the records are passed on for further medical research. Removing such information with machine learning techniques begins with pre-processing the raw data through tokenization and sentence detection. Comparing the techniques shows that CRFs require manual feature engineering to train the model, whereas an LSTM can handle long-term dependencies without much insight into the dataset. A bidirectional LSTM network was used to generate context information from suitable word representations. Finally, a predictive layer was applied to predict the protected health information (PHI) terms having maximum probability. We propose an efficient redaction solution using the two models and evaluate it on the gold-standard i2b2 data set of clinical narratives from the 2014 De-identification challenge; both models achieve good F-scores for PHI of all types. The LSTM-based model achieved a micro-F1 measure of 0.9592, outperforming the CRF-based model.
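
To make the abstract's contrast concrete, the following is a minimal Python sketch of three pieces it mentions: tokenization, the kind of hand-crafted token features a CRF tagger typically needs, and a micro-averaged F1 score. It is not the authors' implementation. The function names, the feature set, the `DOCTOR` label, and the token-level scoring (challenge evaluations are often entity-level) are all assumptions made for illustration.

```python
import re


def tokenize(sentence):
    # Split a sentence into word tokens and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence)


def token_features(tokens, i):
    # Hypothetical hand-crafted features for token i; a CRF needs this kind
    # of manual feature engineering, whereas an LSTM learns from embeddings.
    word = tokens[i]
    return {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "is_upper": word.isupper(),
        "has_digit": any(c.isdigit() for c in word),
        "suffix3": word[-3:],
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def micro_f1(gold, pred, outside="O"):
    # Pool true positives, false positives, and false negatives over all
    # PHI types, then compute a single F1 from the pooled counts.
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != outside and p == g:
            tp += 1
        else:
            if p != outside:
                fp += 1
            if g != outside:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example with made-up labels: one correct PHI token, one spurious one.
tokens = tokenize("Seen by Dr. Lee")
features = [token_features(tokens, i) for i in range(len(tokens))]
gold = ["O", "O", "O", "O", "DOCTOR"]
pred = ["O", "O", "DOCTOR", "O", "DOCTOR"]
score = micro_f1(gold, pred)
```

Micro-averaging pools the counts before computing precision and recall, so frequent PHI types weigh more heavily in the score than rare ones.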

Bibliographic record

Source
International Conference on Reliability, Infocom Technologies and Optimization | 2018 | 910p | 5 pages in total
Authors
Apar Madan; Ann Mary George; Apurva Singh; M.P.S. Bhatia
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords
Hidden Markov models; Labeling; Task analysis; Logic gates; Tagging; Feature extraction; Microsoft Windows

Similar literature

1. Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks [J]. Rui Zhao, Ruqiang Yan, Jinjiang Wang, Sensors. 2017, No. 2
2. Bidirectional LSTM-CRF for Adverse Drug Event Tagging in Electronic Health Records [J]. Susmitha Wunnava, Xiao Qin, Tabassum Kakar, JMLR: Workshop and Conference Proceedings. 2018, No. 2010
3. Private health insurance catches the updraft: More patients again protect themselves privately [PKV spürt Aufwind: Mehr Patienten sichern sich wieder privat ab] [J]. Schmidt K. Der Klinikarzt. 2009, No. 12
4. Redaction of Protected Health Information in EHRs using CRFs and Bi-directional LSTMs [C]. Apar Madan, Ann Mary George, Apurva Singh, International Conference on Reliability, Infocom Technologies and Optimization. 2018
5. Estimating the Effects of Electronic Health Records (EHRs) Sophistication and EHRs Years of Experience on Health Care Quality, Patient Experience, 30-Day Readmissions, and Profitability in U.S. Acute Care Hospitals [D]. Mose, Jason N. 2017
6. Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks [O]. Rui Zhao, Ruqiang Yan, Jinjiang Wang, 2017
7. Bi-directional LSTM-CNNs-CRF for Italian Sequence Labeling and Multi-Task Learning [O]. Pierpaolo Basile, Pierluigi Cassotti, Lucia Siciliani, 2017
8. Health Care Providers' Role in Protecting EHRs: Implications for Consumer Support of EHRs, HIE and Patient-Provider Communication [R]. Hughes, P., Patel, V., Pritts, J. 2014
