Journal: International Journal of Metadata, Semantics and Ontologies

Disambiguation of semantic types in complex noun phrases for extracting candidate terms



Abstract

Mapping concepts from medical resources to structured medical documents is a prerequisite for many automatic document processing tasks. These resources are characterised by an abundance of material to represent any given concept. Moreover, the resources may include ambiguous terms in unstructured form that lead to distorted results in automating biomedical text mining. This paper is an exploratory study on disambiguation of semantic types for extracting a structured taxonomy from unstructured reports. Specifically, the terms that will be disambiguated are terms that have more than one semantic type in the Unified Medical Language System (UMLS) Metathesaurus. We suggest a word sense disambiguation algorithm that utilises the UMLS is-a hierarchy, augmented with a higher level representing semantic groups, as a knowledge base. The purpose is to explore all possible commonalities to classify simple or composed candidate terms with the Nearest Common Kinship (NCK). Experiments with the training corpora provide encouraging results.
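To make the idea of choosing a semantic type by closeness in an is-a hierarchy concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract gives no procedure for the Nearest Common Kinship, so the tree below, the `ROOT` node, and the functions `ancestors`, `kinship_distance`, and `disambiguate` are all assumptions made for illustration. The hierarchy is a heavily simplified toy that borrows a few UMLS-style names, with a semantic group as the level above the types.

```python
from typing import Dict, Iterable, List

ROOT = "ROOT"

# Toy is-a hierarchy (an assumption, not the real UMLS Semantic Network).
# Each key points to its parent; semantic groups sit just below ROOT.
PARENT: Dict[str, str] = {
    "Disease or Syndrome": "Pathologic Function",
    "Pathologic Function": "Disorders",
    "Sign or Symptom": "Finding",
    "Finding": "Disorders",
    "Body Part": "Anatomical Structure",
    "Anatomical Structure": "Anatomy",
    "Disorders": ROOT,
    "Anatomy": ROOT,
}


def ancestors(node: str) -> List[str]:
    """Return the path from node up to ROOT, starting with node itself."""
    path = [node]
    while path[-1] != ROOT:
        path.append(PARENT[path[-1]])
    return path


def kinship_distance(a: str, b: str) -> int:
    """Count the edges from a to b through their nearest common ancestor."""
    steps_from_b = {n: i for i, n in enumerate(ancestors(b))}
    for steps_from_a, n in enumerate(ancestors(a)):
        if n in steps_from_b:
            return steps_from_a + steps_from_b[n]
    raise ValueError("nodes share no ancestor")  # unreachable: ROOT is shared


def disambiguate(candidates: Iterable[str], context_types: Iterable[str]) -> str:
    """Pick the candidate type closest to any type seen in the context."""
    context = list(context_types)
    return min(
        candidates,
        key=lambda c: min(kinship_distance(c, t) for t in context),
    )


if __name__ == "__main__":
    # A term with two possible types, next to a word typed as a disease.
    choice = disambiguate(
        ["Sign or Symptom", "Body Part"],
        ["Disease or Syndrome"],
    )
    print(choice)
```

In this toy tree, "Sign or Symptom" and "Disease or Syndrome" meet at the "Disorders" group, while "Body Part" only meets them at the root, so the sketch prefers "Sign or Symptom". A real system would load the hierarchy and the group assignments from UMLS rather than hard-coding them.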
