Journal: Machine Translation

An error analysis for image-based multi-modal neural machine translation


Abstract

In this article, we conduct an extensive quantitative error analysis of different multi-modal neural machine translation (MNMT) models which integrate visual features into different parts of both the encoder and the decoder. We investigate the scenario where models are trained on an in-domain training data set of parallel sentence pairs with images. We analyse two different types of MNMT models, which use global and local image features: the former encode an image globally, i.e. there is one feature vector representing an entire image, whereas the latter encode spatial information, i.e. there are multiple feature vectors, each encoding a different portion of the image. We conduct an error analysis of translations generated by different MNMT models as well as text-only baselines, where we study how multi-modal models compare when translating both visual and non-visual terms. In general, we find that the additional multi-modal signals consistently improve translations, even more so when using simpler MNMT models that use global visual features. We also find that not only are translations of terms with a strong visual connotation improved, but almost all kinds of errors decrease when using multi-modal models.
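The distinction the abstract draws between global and local (spatial) image features can be made concrete with a small sketch. Below is a minimal illustration, assuming a hypothetical CNN that emits a 7×7 grid of 512-dimensional region descriptors (the shapes and pooling choice are assumptions for illustration, not the paper's exact setup): the spatial grid serves as "local" features, and average-pooling it yields a single "global" feature vector.

```python
import numpy as np

# Hypothetical CNN output for one image: a 7x7 grid of 512-d region descriptors.
# These grid cells play the role of "local" (spatial) features: one vector per region.
spatial_features = np.random.rand(7, 7, 512)

# One common way to obtain a single "global" descriptor is to average-pool the grid,
# collapsing the spatial dimensions into one 512-d vector for the whole image.
global_feature = spatial_features.mean(axis=(0, 1))

# Local view: 49 region vectors that an attention mechanism could attend over.
local_view = spatial_features.reshape(-1, 512)

print(local_view.shape)      # (49, 512)
print(global_feature.shape)  # (512,)
```

An MNMT model using global features would condition the encoder or decoder on the single pooled vector, whereas a model using local features can attend over the 49 region vectors separately.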
