Conference: Language technology for closely related languages and language variants workshop 2014

Exploring cross-language statistical machine translation for closely related South Slavic languages

Abstract

This work investigates the use of cross-language resources for statistical machine translation (SMT) between English and two closely related South Slavic languages, Croatian and Serbian. The goal is to explore the effects of translating from and into one language using an SMT system trained on the other. For translation into English, the loss due to cross-translation is about 13% of the BLEU score, and for the other translation direction about 15%. The performance decrease for both languages in both translation directions is mainly due to lexical divergences. Several language adaptation methods are explored; very simple lexical transformations can already yield a small improvement, and the most promising adaptation method is a Croatian-Serbian SMT system trained on a very small corpus.
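As a rough illustration of two ideas in the abstract, the sketch below shows what a very simple word-level lexical transformation and a relative loss computation might look like. It is not the authors' method: the word table `HR_TO_SR`, the function names `adapt_lexically` and `relative_loss`, and the reading of "13% of BLEU" as a loss relative to the in-language score are all assumptions made for this example.

```python
from typing import Mapping

# Hypothetical Croatian -> Serbian word pairs, chosen for illustration only;
# the abstract does not list the transformations actually used in the paper.
HR_TO_SR = {
    "tko": "ko",
    "točno": "tačno",
    "kruh": "hleb",
    "vlak": "voz",
}


def adapt_lexically(sentence: str, table: Mapping[str, str] = HR_TO_SR) -> str:
    """Replace every whitespace-separated token that has an entry in the table."""
    return " ".join(table.get(token, token) for token in sentence.split())


def relative_loss(in_language_bleu: float, cross_language_bleu: float) -> float:
    """Loss expressed as a fraction of the in-language BLEU score (assumed reading)."""
    return (in_language_bleu - cross_language_bleu) / in_language_bleu
```

For instance, `adapt_lexically("tko je to")` gives `"ko je to"`, and with made-up scores of 40.0 (in-language) and 34.8 (cross-language), `relative_loss` comes out at roughly 0.13. A table lookup like this ignores casing, punctuation, and inflection, which is one reason a small trained Croatian-Serbian system could plausibly do better.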