Journal: Machine Translation

An error analysis for image-based multi-modal neural machine translation


Abstract

In this article, we conduct an extensive quantitative error analysis of different multi-modal neural machine translation (MNMT) models which integrate visual features into different parts of both the encoder and the decoder. We investigate the scenario where models are trained on an in-domain training data set of parallel sentence pairs with images. We analyse two different types of MNMT models, which use global and local image features: the former encode an image globally, i.e. there is one feature vector representing an entire image, whereas the latter encode spatial information, i.e. there are multiple feature vectors, each encoding a different portion of the image. We conduct an error analysis of translations generated by different MNMT models as well as text-only baselines, where we study how multi-modal models compare when translating both visual and non-visual terms. In general, we find that the additional multi-modal signals consistently improve translations, even more so with the simpler MNMT models that use global visual features. We also find that not only are translations of terms with a strong visual connotation improved, but almost all kinds of errors decrease when using multi-modal models.
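
To illustrate the distinction between global and local visual features described above, the minimal sketch below extracts both kinds of features from a pre-trained CNN backbone, assuming PyTorch and torchvision (which the paper does not prescribe). The projection layer img_to_hidden and the decoder-state initialisation are hypothetical stand-ins for one of the encoder/decoder integration points such models compare, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pre-trained CNN backbone; dropping the final pooling and classifier
# layers leaves the last convolutional feature map.
# (Older torchvision versions use models.resnet50(pretrained=True).)
cnn = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone = nn.Sequential(*list(cnn.children())[:-2])
backbone.eval()

image = torch.randn(1, 3, 224, 224)  # a dummy pre-processed image

with torch.no_grad():
    fmap = backbone(image)  # spatial feature map: (1, 2048, 7, 7)

# Local features: multiple vectors, each encoding a different portion
# of the image (here 7 x 7 = 49 regions, 2048 dimensions each).
local_feats = fmap.flatten(2).transpose(1, 2)  # (1, 49, 2048)

# Global feature: a single vector representing the entire image,
# obtained by average-pooling over the spatial grid.
global_feat = fmap.mean(dim=(2, 3))  # (1, 2048)

# Hypothetical integration point: project the global feature and use
# it to initialise the hidden state of an NMT decoder.
hidden_size = 512
img_to_hidden = nn.Linear(2048, hidden_size)
decoder_init = torch.tanh(img_to_hidden(global_feat))  # (1, 512)

print(local_feats.shape, global_feat.shape, decoder_init.shape)
```

Local features of this shape are typically consumed by an attention mechanism over the 49 regions, whereas the single global vector lends itself to simpler uses such as the decoder initialisation sketched here.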
