Conference: Workshop on Knowledge Extraction and Integration for Deep Learning Architectures
Target Concept Guided Medical Concept Normalization in Noisy User-Generated Texts

Abstract

Medical concept normalization (MCN), i.e., the mapping of colloquial medical phrases to standard concepts, is an essential step in the analysis of medical social media text. The main drawback of the existing state-of-the-art approach (Kalyan and Sangeetha, 2020b) is that it learns target concept vector representations from scratch, which requires more training instances. Our model is based on RoBERTa and target concept embeddings. In our model, we integrate a) target concept information, in the form of target concept vectors generated by encoding target concept descriptions with SRoBERTa, a state-of-the-art RoBERTa-based sentence embedding model, and b) domain lexicon knowledge, by enriching the target concept vectors with synonym relationship knowledge using the retrofitting algorithm. This is the first attempt in MCN to exploit both target concept information and domain lexicon knowledge in the form of retrofitted target concept vectors. Our model outperforms all existing models, with an accuracy improvement of up to 1.36% on three standard datasets. Further, when trained only on mapping lexicon synonyms, our model achieves up to a 4.87% improvement in accuracy.
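The retrofitting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which retrofitting variant or hyperparameters are used, so the code below assumes the common iterative scheme in which each vector is pulled toward the mean of its synonym neighbours while staying close to its original value. The function name `retrofit`, the `alpha` weight, the iteration count, and the toy concept names are all hypothetical, and the small NumPy vectors stand in for SRoBERTa encodings of concept descriptions.

```python
import numpy as np

def retrofit(vectors, synonyms, alpha=1.0, iterations=10):
    """Pull each vector toward its synonym neighbours (illustrative sketch).

    vectors:  dict mapping a concept name to a 1-D array (its initial embedding).
    synonyms: dict mapping a concept name to a list of neighbour names.
    alpha:    assumed weight that keeps a vector close to its original value.
    """
    original = {k: np.asarray(v, dtype=float) for k, v in vectors.items()}
    fitted = {k: v.copy() for k, v in original.items()}
    for _ in range(iterations):
        for name, neighbours in synonyms.items():
            # Ignore neighbours that have no vector.
            known = [n for n in neighbours if n in fitted]
            if name not in fitted or not known:
                continue
            # Neighbour weights sum to 1, so the update is a weighted
            # average of the original vector and the neighbour mean.
            neighbour_mean = np.mean([fitted[n] for n in known], axis=0)
            fitted[name] = (alpha * original[name] + neighbour_mean) / (alpha + 1.0)
    return fitted

# Toy example: two synonymous concepts and one unrelated concept.
toy_vectors = {
    "headache": np.array([1.0, 0.0]),
    "cephalalgia": np.array([0.0, 1.0]),
    "rash": np.array([5.0, 5.0]),
}
toy_synonyms = {"headache": ["cephalalgia"], "cephalalgia": ["headache"]}
toy_fitted = retrofit(toy_vectors, toy_synonyms)
```

In this sketch the two synonym vectors move closer together, while the concept with no synonym links keeps its original vector.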
