Journal: International Journal of Knowledge Discovery in Bioinformatics

Towards a Mixed Approach to Extract Biomedical Terms from Text Corpus


Abstract

The objective of this paper is to present a methodology to automatically extract and rank biomedical terms from free text. The authors present new extraction methods that take into account linguistic patterns specialized for the biomedical domain, statistical term extraction measures such as C-value, and statistical keyword extraction measures such as Okapi BM25 and TF-IDF. These measures are combined in order to improve the extraction process, and the authors investigate which combinations are the most relevant in different contexts. Experimental results show that an appropriate harmonic mean of C-value combined with keyword extraction measures offers better precision, for both single-word and multi-word term extraction. Experiments describe the extraction of English and French biomedical terms from a corpus of laboratory tests available online. The results are validated using UMLS (for English) and MeSH only (for French) as reference dictionaries.
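
The abstract does not give the exact combination formula, so the following is only a minimal sketch of what a harmonic-mean combination of two term scores could look like. It assumes each candidate term already has a C-value and a keyword score (for example TF-IDF or Okapi BM25), and that both are scaled by their maximum before being combined. The function names, the scaling step, and the toy scores are all illustrative assumptions, not the authors' implementation or data.

```python
from typing import Dict, List, Tuple


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative scores; 0.0 if either is not positive."""
    if a <= 0 or b <= 0:
        return 0.0
    return 2 * a * b / (a + b)


def scale_by_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Divide every score by the largest one so both measures lie in [0, 1].

    Assumption: the scores are non-negative. The scaling scheme is a
    placeholder; the paper may normalize differently.
    """
    top = max(scores.values())
    if top <= 0:
        return {term: 0.0 for term in scores}
    return {term: score / top for term, score in scores.items()}


def rank_terms(
    c_value: Dict[str, float], keyword_score: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Rank the terms present in both inputs by the harmonic mean of their scaled scores."""
    c = scale_by_max(c_value)
    k = scale_by_max(keyword_score)
    combined = {t: harmonic_mean(c[t], k[t]) for t in c.keys() & k.keys()}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Toy scores, invented purely for illustration (not experimental results).
toy_c_value = {"blood glucose": 4.0, "white blood cell": 3.0, "test": 1.0}
toy_tfidf = {"blood glucose": 0.5, "white blood cell": 1.0, "test": 0.1}

ranking = rank_terms(toy_c_value, toy_tfidf)
for term, score in ranking:
    print(f"{term}: {score:.3f}")
```

The harmonic mean is low whenever either input is low, so a term has to score reasonably on both the termhood measure and the keyword measure to reach the top of the ranking.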
