Conference: IEEE International Conference on Multimedia Expo Workshops

What Topics Do Images Say: A Neural Image Captioning Model with Topic Representation

Abstract

Image captioning aims to automatically generate natural-language descriptions of images. In recent years, most methods have tackled this problem in an end-to-end fashion, generating captions directly from image-level features while ignoring high-level semantic information. Methods that introduced attribute concepts into the CNN-RNN framework brought considerable improvement, but their performance depended heavily on manually selected attributes. In this paper, we propose a topic-guided neural image captioning model that incorporates a topic model into the CNN-RNN framework. Our model represents each image as a set of topics and each topic as a collection of words with corresponding distributions. We conduct experiments on the Microsoft COCO dataset. The results show that our model outperforms the baselines and achieves promising performance, which verifies that topic features are effective for representing the high-level semantic information of images.
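
The abstract does not say how topics are inferred or how they are fed into the CNN-RNN framework, so the following is only a minimal sketch of the two-level representation it describes: an image as weights over topics, and each topic as a distribution over words. The vocabulary, the two topics, all numbers, and the helper names `word_prior` and `top_words` are hypothetical and are not taken from the paper.

```python
# Illustrative only: toy vocabulary and probabilities, not values from the paper.
VOCAB = ["dog", "grass", "frisbee", "pizza", "table", "plate"]

# Each topic is a probability distribution over the vocabulary (each row sums to 1).
TOPIC_WORD = [
    [0.40, 0.30, 0.30, 0.00, 0.00, 0.00],  # hypothetical "outdoor play" topic
    [0.00, 0.00, 0.00, 0.40, 0.30, 0.30],  # hypothetical "meal" topic
]


def word_prior(image_topics, topic_word=TOPIC_WORD):
    """Mix the topic-word distributions using one image's topic weights."""
    if len(image_topics) != len(topic_word):
        raise ValueError("one weight per topic is required")
    vocab_size = len(topic_word[0])
    return [
        sum(weight * topic[j] for weight, topic in zip(image_topics, topic_word))
        for j in range(vocab_size)
    ]


def top_words(image_topics, k=3):
    """Return the k vocabulary words with the highest mixed probability."""
    prior = word_prior(image_topics)
    ranked = sorted(range(len(VOCAB)), key=lambda j: prior[j], reverse=True)
    return [VOCAB[j] for j in ranked[:k]]
```

Under these assumptions, an image whose topic weights are mostly on the first topic, such as `[0.75, 0.25]`, gets a word prior concentrated on "dog", "grass", and "frisbee". A caption decoder could in principle condition on such a vector, but how the actual model uses topic features is not described in the abstract.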