
A classification approach for detecting cross-lingual biomedical term translations


Abstract

Finding translations for technical terms is an important problem in machine translation. In particular, in highly specialized domains such as biology or medicine, it is difficult to find bilingual experts to annotate sufficient cross-lingual texts in order to train machine translation systems. Moreover, new terms are constantly being generated in the biomedical community, which makes it difficult to keep the translation dictionaries up to date for all language pairs of interest. Given a biomedical term in one language (source language), we propose a method for detecting its translations in a different language (target language). Specifically, we train a binary classifier to determine whether two biomedical terms written in two languages are translations. Training such a classifier is often complicated due to the lack of common features between the source and target languages. We propose several feature space concatenation methods to successfully overcome this problem. Moreover, we study the effectiveness of contextual and character n-gram features for detecting term translations. Experiments conducted using a standard dataset for biomedical term translation show that the proposed method outperforms several competitive baseline methods in terms of mean average precision and top-k translation accuracy.
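The ingredients named in the abstract (character n-gram features, concatenation of source and target feature spaces, and evaluation by mean average precision and top-k accuracy) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the boundary markers `^` and `$`, the `src:`/`tgt:` prefixes, and the choice of bigrams and trigrams are all assumptions made for illustration, and the concatenation shown is only the simplest variant (a disjoint union of the two feature spaces), not necessarily one of the methods the paper proposes.

```python
from collections import Counter


def char_ngrams(term, n_values=(2, 3)):
    """Count character n-grams of a term; '^' and '$' mark its boundaries (assumed convention)."""
    padded = "^" + term.lower() + "$"
    grams = Counter()
    for n in n_values:
        for i in range(len(padded) - n + 1):
            grams[padded[i:i + n]] += 1
    return grams


def concat_features(source_term, target_term):
    """Represent a term pair in one vector by keeping source and target
    n-grams in disjoint, language-prefixed feature spaces (hypothetical scheme)."""
    features = {}
    for gram, count in char_ngrams(source_term).items():
        features["src:" + gram] = count
    for gram, count in char_ngrams(target_term).items():
        features["tgt:" + gram] = count
    return features


def average_precision(ranked, gold):
    """Average of precision@rank over the ranks at which a correct translation appears."""
    hits = 0
    total = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            total += hits / rank
    return total / len(gold) if gold else 0.0


def mean_average_precision(rankings, gold_sets):
    """Mean of average precision over all source terms."""
    scores = [average_precision(r, g) for r, g in zip(rankings, gold_sets)]
    return sum(scores) / len(scores) if scores else 0.0


def top_k_accuracy(rankings, gold_sets, k):
    """Fraction of source terms with at least one correct translation in the top k candidates."""
    correct = sum(
        1 for r, g in zip(rankings, gold_sets) if any(c in g for c in r[:k])
    )
    return correct / len(rankings) if rankings else 0.0
```

In such a setup, a binary classifier would be trained on the output of `concat_features` for known translation pairs and non-pairs, and its scores would be used to rank candidate target terms for each source term before computing the two metrics. Note that a purely disjoint concatenation gives a linear classifier no features shared between the two languages, which is the difficulty the abstract describes.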

Bibliographic details

  • Source
    Natural Language Engineering | 2017, Issue 1 | pp. 31-51 | 21 pages
  • Authors

    H. HAKAMI; D. BOLLEGALA;

  • Author affiliations

    Computer Science Department, Taif University, Saudi Arabia;

    Department of Computer Science, The University of Liverpool, UK;

  • Original format: PDF
  • Text language: English