Journal: Terminology

TermEnsembler: An ensemble learning approach to bilingual term extraction and alignment

Abstract

This paper describes TermEnsembler, a bilingual term extraction and alignment system that uses a novel ensemble learning approach to bilingual term alignment. Processing starts with monolingual term extraction from a language-industry-standard file type containing aligned English and Slovenian texts. The two resulting term lists are then aligned automatically by an ensemble of seven bilingual alignment methods, which are first executed separately and then merged using weights learned with an evolutionary algorithm. In the experiments, the weights were learned on one domain and tested on two other domains. Evaluated on the top 400 aligned term pairs, the precision of term alignment is over 96%, and correctly aligned multi-word unit terms account for more than 30% of those pairs.
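The merging step described in the abstract can be illustrated with a small sketch. The abstract does not name the seven alignment methods, the exact merging rule, or the evolutionary algorithm, so everything below is an assumption made for illustration: each method is taken to output a score per candidate term pair, the merge is a plain weighted sum, and the weights are tuned by a simple elitist mutation search that maximizes precision on the top k pairs. All function names and the toy data are hypothetical.

```python
import random


def merge_scores(method_scores, weights):
    """Weighted sum of per-method scores for each candidate (source, target) pair."""
    merged = {}
    for scores, weight in zip(method_scores, weights):
        for pair, score in scores.items():
            merged[pair] = merged.get(pair, 0.0) + weight * score
    return merged


def top_k(merged, k):
    """Return the k highest-scoring pairs, best first (ties broken by pair order)."""
    return sorted(merged, key=lambda pair: (-merged[pair], pair))[:k]


def precision_at_k(ranked, gold):
    """Fraction of ranked pairs that appear in the gold-standard set."""
    if not ranked:
        return 0.0
    return sum(1 for pair in ranked if pair in gold) / len(ranked)


def evolve_weights(method_scores, gold, k, generations=30, pop_size=20, seed=0):
    """Toy elitist search: keep the better half, refill with mutated copies.

    This is an assumed stand-in for the evolutionary algorithm mentioned in
    the abstract, not the procedure used in the paper.
    """
    rng = random.Random(seed)
    n_methods = len(method_scores)

    def fitness(weights):
        return precision_at_k(top_k(merge_scores(method_scores, weights), k), gold)

    population = [[rng.random() for _ in range(n_methods)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            [max(0.0, gene + rng.gauss(0.0, 0.1)) for gene in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    # Hypothetical scores from two of the (unspecified) alignment methods.
    method_a = {("term", "izraz"): 0.9, ("term", "beseda"): 0.4}
    method_b = {("term", "izraz"): 0.2, ("term", "beseda"): 0.8}
    gold = {("term", "izraz")}
    weights = evolve_weights([method_a, method_b], gold, k=1)
    print(top_k(merge_scores([method_a, method_b], weights), 1))
```

In a setup like the one the abstract reports, the weights would be learned on term pairs from one domain and then reused unchanged when ranking candidate pairs from other domains.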
