10th Workshop on Multiword Expressions

German Compounds and Statistical Machine Translation. Can they get along?



Abstract

This paper reports on experiments designed to study the impact of linguistically informed preprocessing of German compounds prior to translation in Statistical Machine Translation (SMT). Compounds are a known challenge in Machine Translation (MT), in translation in general, and in other Natural Language Processing (NLP) applications. In SMT, German compounds are split into their constituents to reduce the number of unknown words and to improve scores on evaluation measures such as BLEU. To assess the extent to which German compounds need to be handled as part of preprocessing in SMT systems, we tested different compound splitters and strategies, such as adding lists of compounds and their translations to the training set. This paper summarizes the results of these experiments, which aim to yield better translations of German nominal compounds into Spanish, and shows that our approach improves on the baseline by up to 1.4 BLEU points.
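To make the idea of splitting compounds into constituents concrete, the following is a minimal sketch of a frequency-based splitter. It is not the method used in the paper; the abstract does not say which splitters were tested. The function name make_splitter, the linker list, the minimum part length, and the toy word counts are all assumptions made for illustration. The sketch enumerates every way of cutting a word into parts that appear in a vocabulary, allowing a linking element such as "s" after a non-final part, and keeps the candidate whose parts have the highest geometric mean of corpus frequencies.

```python
from functools import lru_cache


def make_splitter(counts, min_len=3, linkers=("", "s", "es")):
    """Build a splitter from a word -> corpus frequency mapping (hypothetical helper)."""

    @lru_cache(maxsize=None)
    def segmentations(word):
        # Collect every segmentation of `word` into parts found in `counts`.
        results = []
        if word in counts:
            results.append((word,))  # the unsplit word is itself a candidate
        for i in range(min_len, len(word) - min_len + 1):
            head, tail = word[:i], word[i:]
            for linker in linkers:
                if linker and not head.endswith(linker):
                    continue
                # Drop the linking element (if any) to recover the stem.
                stem = head[: len(head) - len(linker)] if linker else head
                if len(stem) >= min_len and stem in counts:
                    for rest in segmentations(tail):
                        results.append((stem,) + rest)
        return tuple(results)

    def score(parts):
        # Geometric mean of the part frequencies.
        product = 1.0
        for part in parts:
            product *= counts[part]
        return product ** (1.0 / len(parts))

    def split(word):
        word = word.lower()
        candidates = segmentations(word)
        if not candidates:
            return [word]  # unknown word with no known parts: leave it alone
        return list(max(candidates, key=score))

    return split


# Toy frequencies, invented for this example only.
toy_counts = {"arbeit": 100, "markt": 80, "arbeitsmarkt": 5, "haus": 200, "tür": 150}
split = make_splitter(toy_counts)
print(split("Arbeitsmarkt"))
print(split("Haustür"))
```

With these toy counts, "Arbeitsmarkt" is split because its parts are far more frequent than the whole compound, which is the effect that reduces unknown words at translation time. A real system would take its counts from the training corpus and would likely need a richer set of linking elements.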
