Workshop on Language Technologies for Historical and Ancient Languages

Latin-Spanish Neural Machine Translation: from the Bible to Saint Augustine

Abstract

Although historical texts can be found in several sources, they are usually available only in their original language, which makes them inaccessible to most readers. This paper presents the development of state-of-the-art Neural Machine Translation systems for the low-resource Latin-Spanish language pair. First, we build a Transformer-based Machine Translation system on the Bible parallel corpus. Then, we build a comparable corpus from Saint Augustine's texts and their translations. We use this corpus to study domain adaptation from the Bible texts to Saint Augustine's works. The results show the difficulties of handling a low-resource language such as Latin. First, the systems do not achieve high BLEU scores, which underlines the importance of having enough data. Regarding domain adaptation, the results show that using in-domain data helps the systems achieve better translation quality. We also observed that a larger amount of data is needed to perform an effective vocabulary extension that includes in-domain vocabulary.
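The abstract reports translation quality as BLEU scores. As a minimal sketch of what that metric measures, the Python below computes a simplified corpus-level BLEU (clipped n-gram precision up to 4-grams, one reference per sentence, whitespace tokenization, no smoothing). The function names and the sample sentences are illustrative assumptions, not taken from the paper, and the paper's actual scoring tool and tokenization are not stated here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU on a 0-100 scale.

    Assumptions: one reference per hypothesis, whitespace tokenization,
    no smoothing (any n-gram order with zero matches gives a score of 0).
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Made-up Spanish sentences, for illustration only.
    system_output = ["en el principio creó dios los cielos y la tierra"]
    reference = ["en el principio creó dios el cielo y la tierra"]
    print(f"BLEU = {corpus_bleu(system_output, reference):.2f}")
```

Because the score is a geometric mean over 1- to 4-gram precision, a system that gets individual words right but rarely produces longer matching sequences scores low, which is a typical pattern when training data is scarce.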
