Source Language Adaptation for Resource-Poor Machine Translation

Abstract

We propose a novel, language-independent approach for improving machine translation from a resource-poor language to X by adapting a large bi-text for a related resource-rich language and X (the same target language). We assume a small bi-text for the resource-poor language to X pair, which we use to learn word-level and phrase-level paraphrases and cross-lingual morphological variants between the resource-rich and the resource-poor language; we then adapt the former to get closer to the latter. Our experiments for Indonesian/Malay-English translation show that using the large adapted resource-rich bi-text yields 6.7 BLEU points of improvement over the unadapted one and 2.6 BLEU points over the original small bi-text. Moreover, combining the small bi-text with the adapted bi-text outperforms the corresponding combinations with the unadapted bi-text by 1.5-3 BLEU points. We also demonstrate applicability to other languages and domains.
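The word-level part of the adaptation idea can be illustrated with a minimal sketch. It is not the authors' system: it assumes a paraphrase table has already been learned, and it replaces each resource-rich source word with its highest-scoring resource-poor counterpart while leaving the target side of the bi-text untouched. The names `adapt_sentence`, `adapt_bitext`, and `min_score`, along with the toy table entries and scores, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical paraphrase table: resource-rich word -> list of
# (resource-poor word, score) candidates. Entries are toy values.
ParaphraseTable = Dict[str, List[Tuple[str, float]]]


def adapt_sentence(tokens: List[str],
                   table: ParaphraseTable,
                   min_score: float = 0.5) -> List[str]:
    """Greedily replace each token with its best-scoring paraphrase.

    A token is kept unchanged when it has no candidates or when the
    best candidate scores below min_score (an assumed threshold).
    """
    adapted = []
    for tok in tokens:
        candidates = table.get(tok, [])
        best = max(candidates, key=lambda c: c[1], default=None)
        if best is not None and best[1] >= min_score:
            adapted.append(best[0])
        else:
            adapted.append(tok)
    return adapted


def adapt_bitext(bitext: List[Tuple[List[str], List[str]]],
                 table: ParaphraseTable,
                 min_score: float = 0.5) -> List[Tuple[List[str], List[str]]]:
    """Adapt only the source side; the target (X) side is kept as-is."""
    return [(adapt_sentence(src, table, min_score), tgt)
            for src, tgt in bitext]


# Toy Indonesian -> Malay table (illustrative entries and scores).
toy_table: ParaphraseTable = {
    "karena": [("kerana", 0.9)],
    "mobil": [("kereta", 0.7), ("motokar", 0.2)],
    "bisa": [("boleh", 0.4)],
}

toy_bitext = [
    (["dia", "bisa", "pergi", "karena", "mobil"],
     ["he", "can", "go", "because", "of", "the", "car"]),
]

adapted_bitext = adapt_bitext(toy_bitext, toy_table)
```

Greedy per-word substitution ignores context and the phrase-level and morphological variants mentioned in the abstract; it is shown here only because it is the simplest way to see how the source side of a large bi-text can be moved closer to the resource-poor language.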