Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)

UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus



Abstract

Contextual word embedding models, such as BioBERT and Bio_ClinicalBERT, have achieved state-of-the-art results on biomedical natural language processing tasks by focusing their pre-training process on domain-specific corpora. However, such models do not take into consideration structured expert domain knowledge from a knowledge base. We introduce UmlsBERT, a contextual embedding model that integrates domain knowledge during the pre-training process via a novel knowledge augmentation strategy. More specifically, the augmentation of UmlsBERT with the Unified Medical Language System (UMLS) Metathesaurus is performed in two ways: (i) connecting words that have the same underlying 'concept' in UMLS, and (ii) leveraging semantic type knowledge in UMLS to create clinically meaningful input embeddings. By applying these two strategies, UmlsBERT can encode clinical domain knowledge into word embeddings and outperform existing domain-specific models on common named-entity recognition (NER) and clinical natural language inference tasks.
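The second augmentation strategy described above — adding semantic type knowledge to the input embeddings — can be illustrated with a minimal sketch. In standard BERT, each input embedding is the sum of token, position, and segment embeddings; the idea here, shown with small illustrative numpy tables (all names, sizes, and the fourth lookup table are assumptions, not the paper's actual implementation), is to add one more learned embedding indexed by the token's UMLS semantic type:

```python
import numpy as np

# Illustrative sketch: BERT-style input embeddings augmented with a
# semantic-type embedding table, as described in the UmlsBERT abstract.
# Table names and dimensions are hypothetical, chosen small for clarity.
rng = np.random.default_rng(0)
VOCAB, TYPES, MAXLEN, DIM = 100, 5, 16, 8

tok_emb = rng.normal(size=(VOCAB, DIM))      # token (wordpiece) embeddings
pos_emb = rng.normal(size=(MAXLEN, DIM))     # position embeddings
seg_emb = rng.normal(size=(2, DIM))          # segment (sentence A/B) embeddings
sem_emb = rng.normal(size=(TYPES + 1, DIM))  # UMLS semantic types; index 0 = no type

def input_embeddings(token_ids, segment_ids, sem_type_ids):
    """Sum the four lookup tables position-wise.

    Standard BERT uses the first three terms; the fourth term injects
    the token's UMLS semantic type into the input representation.
    """
    n = len(token_ids)
    return (tok_emb[token_ids]
            + pos_emb[np.arange(n)]
            + seg_emb[segment_ids]
            + sem_emb[sem_type_ids])

# Three tokens; the last two tagged with (hypothetical) semantic type 2.
emb = input_embeddings([3, 17, 42], [0, 0, 0], [0, 2, 2])
print(emb.shape)  # (3, 8)
```

Because the semantic-type table is trained jointly with the rest of the model, two surface forms sharing a UMLS type receive a shared additive component in their input representations, which is one way clinical knowledge can shape the embeddings before any transformer layer is applied.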


