IEEE International Conference on Collaboration and Internet Computing

A Short Survey of LSTM Models for De-identification of Medical Free Text

Abstract

The confidentiality of patient information is protected by government regulations in various countries, such as the Health Insurance Portability and Accountability Act (HIPAA) standards in the USA. Under these laws, adequate protections must be in place to safeguard patients' health records, which are often big data composed of free text. Machine learning approaches are used extensively for the automated de-identification of medical free text, and several studies that incorporate long short-term memory (LSTM) networks report outstanding results. These networks are a variant of the recurrent neural network (RNN) architecture. Our survey covers LSTM models from the past five years, and the contribution of the findings is appreciable. In terms of performance, LSTMs generally surpassed the other types of models used for automated de-identification of free text, namely conditional random field (CRF) algorithms and rule-based algorithms. In addition, hybrid or ensemble LSTM models did not outperform LSTM-only models. Finally, we note that the customization of gold-standard de-identification datasets may result in overfitted models.
