Published in: Information Retrieval

Biomedical term extraction: overview and a new methodology


Abstract

Terminology extraction is an essential task in domain knowledge acquisition and in information retrieval. It is also a mandatory first step in building or enriching terminologies and ontologies. Existing terminology extraction methods, as proposed in the literature, combine linguistic and statistical aspects and address some, but not all, of the problems related to term extraction, e.g. noise, silence, low frequency, large corpora, and the complexity of the multi-word term extraction process. In contrast, we propose a cutting-edge methodology to extract and rank biomedical terms that covers all of the problems mentioned. This methodology offers several measures based on linguistic, statistical, graph and web aspects. These measures extract and rank candidate terms with excellent precision: we demonstrate that they outperform previously reported precision results for automatic term extraction and that they work with different languages (English, French, and Spanish). We also demonstrate how using graphs and the web to assess the significance of a term candidate enables us to improve on those precision results. We evaluated our methodology on the biomedical GENIA and LabTestsOnline corpora and compared it with previously reported measures.
