International Conference on Graphic and Image Processing

Generating Description with Multi-feature Fusion and Saliency Maps of Image

Abstract

Generating a description for an image can be regarded as visual understanding, a task that spans artificial intelligence, machine learning, natural language processing, and many other areas. In this paper, we present a model that generates descriptions for images using an RNN (recurrent neural network) with object attention and multiple image features. Deep recurrent neural networks have shown excellent performance in machine translation, so we use them to generate natural-sentence descriptions for images. The common approach uses a single CNN (convolutional neural network) trained on ImageNet to extract image features, but we argue that this cannot adequately capture the content of an image, since it may focus only on object regions. We therefore add scene information to the image features using a CNN trained on Places205. Experiments show that a model with multiple features extracted by two CNNs performs better than one with a single feature. In addition, we apply saliency weights to images to emphasize the salient objects. We evaluate our model on MSCOCO using public metrics, and the results show that it outperforms several state-of-the-art methods.
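The two ideas the abstract combines, fusing object features (ImageNet CNN) with scene features (Places205 CNN) and weighting spatial activations by a saliency map, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function names, feature shapes, and the normalized weighted-pooling scheme are assumptions for the sake of the example.

```python
import numpy as np

def saliency_weighted_pool(conv_maps, saliency):
    """Pool spatial CNN activations into one feature vector,
    weighting each location by a saliency map (hypothetical scheme).

    conv_maps: (C, H, W) activations from a convolutional layer.
    saliency:  (H, W) non-negative saliency weights.
    Returns a (C,) vector emphasizing salient regions.
    """
    # Normalize saliency so the weights sum to 1 (epsilon avoids /0).
    w = saliency / (saliency.sum() + 1e-8)
    return (conv_maps * w[None, :, :]).sum(axis=(1, 2))

def fuse_features(object_feat, scene_feat):
    """Multi-feature fusion by concatenating the object-centric
    (ImageNet-trained) and scene (Places205-trained) CNN features."""
    return np.concatenate([object_feat, scene_feat])
```

The fused vector would then condition the RNN decoder at each step of sentence generation; concatenation is the simplest fusion choice, and other schemes (e.g. a learned projection of both features) are possible.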
