
Exploring cross-language statistical machine translation for closely related South Slavic languages

Abstract

This work investigates the use of cross-language resources for statistical machine translation (SMT) between English and two closely related South Slavic languages, Croatian and Serbian. The goal is to explore the effects of translating from and into one of these languages using an SMT system trained on the other. For translation into English, the loss due to cross-translation is about 13% of BLEU, and for the other translation direction it is about 15%. The performance decrease for both languages in both translation directions is mainly due to lexical divergences. Several language adaptation methods are explored. Very simple lexical transformations can already yield a small improvement, and the most promising adaptation method is a Croatian-Serbian SMT system trained on a very small corpus.
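The abstract does not spell out what the "very simple lexical transformations" are. As a rough illustration only, the sketch below assumes they can be approximated by a whole-word substitution table applied to the text before it is passed to the SMT system trained on the other language. The table `HR_TO_SR` and the function `adapt` are hypothetical names, and the four word pairs are well-known Croatian/Serbian differences chosen for illustration, not the resource used in the paper.

```python
import re

# Hypothetical Croatian -> Serbian word list (illustrative, not from the paper).
HR_TO_SR = {
    "tko": "ko",          # who
    "kruh": "hleb",       # bread
    "vlak": "voz",        # train
    "tisuća": "hiljada",  # thousand
}


def adapt(sentence, table=HR_TO_SR):
    """Replace every whole word found in the table; leave all other text unchanged."""

    def swap(match):
        word = match.group(0)
        target = table.get(word.lower())
        if target is None:
            return word
        # Keep the capitalisation of the first letter of the source word.
        return target.capitalize() if word[0].isupper() else target

    # \w+ matches Unicode letters in Python 3, so words such as "tisuća" stay intact.
    return re.sub(r"\w+", swap, sentence)


print(adapt("Tko je kupio kruh?"))  # prints "Ko je kupio hleb?"
```

A table of base forms like this one does not cover inflected forms or differences in spelling and syntax, which may be one reason the abstract reports only a small improvement from such transformations compared with a trained Croatian-Serbian SMT system.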
