16th Workshop on Biomedical Natural Language Processing

Adapting Pre-trained Word Embeddings For Use In Medical Coding


Abstract

Word embeddings are a crucial component of modern NLP. Pre-trained embeddings released by different groups have been a major driver of their popularity. However, they are trained on generic corpora, which limits their direct use for domain-specific tasks. In this paper, we propose a method to add task-specific information to pre-trained word embeddings; such information can improve their utility. We add information from medical coding data, as well as the first level of the ICD-10 medical code hierarchy, to different pre-trained word embeddings. We adapt the CBOW algorithm from the word2vec package for this purpose. We evaluated our approach on five different pre-trained word embeddings. Both the original word embeddings and their modified versions (those with added information) were used for automated review of medical coding. The modified word embeddings improve F-score by 1% in a 5-fold evaluation on a private medical claims dataset. Our results show that adding extra information is possible and beneficial for the task at hand.
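The abstract only names the approach (continuing CBOW-style training so that medical-coding text and the first level of the ICD-10 hierarchy flow into existing vectors); the paper's own modified CBOW implementation is not reproduced here. Below is a minimal sketch of the general idea using gensim 4.x: pre-trained vectors seed a CBOW model, and training then continues on a small domain corpus. The corpus, file names, hyperparameters, and the specific gensim calls (intersect_word2vec_format, vectors_lockf) are illustrative assumptions, not the authors' code.

import numpy as np
from gensim.models import Word2Vec

# Hypothetical domain corpus: tokenized medical-coding sentences plus
# first-level (chapter) descriptions from ICD-10. The paper's actual data
# is a private medical claims dataset and is not reproduced here.
domain_sentences = [
    ["acute", "upper", "respiratory", "infection"],
    ["certain", "infectious", "and", "parasitic", "diseases"],   # ICD-10 chapter I
    ["diseases", "of", "the", "respiratory", "system"],          # ICD-10 chapter X
]

model = Word2Vec(vector_size=300, window=5, min_count=1, sg=0)   # sg=0 selects CBOW
model.build_vocab(domain_sentences)

# Seed the model with pre-trained vectors (placeholder file name);
# lockf=1.0 lets the imported vectors keep updating during training.
model.wv.vectors_lockf = np.ones(len(model.wv), dtype=np.float32)
model.wv.intersect_word2vec_format("pretrained_vectors.bin", binary=True, lockf=1.0)

# Continue CBOW training so the embeddings absorb the domain signal.
model.train(domain_sentences, total_examples=model.corpus_count, epochs=10)

model.wv.save("adapted_vectors.kv")

In a setup like this, the adapted vectors would then feed whatever downstream classifier performs the automated review of medical coding.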