Journal: Multimedia Tools and Applications

Weakly-supervised image captioning based on rich contextual information

Abstract

Automatically generating an image description is a challenging task that attracts broad attention in artificial intelligence. Inspired by methods from computer vision and natural language processing, different approaches have been proposed to solve the problem. However, captions generated by existing approaches lack sufficient contextual information to describe the corresponding images completely. The labeled captions in the training set describe only the basic content of the images and lack sufficient contextual annotations. In this paper, we propose a Weakly-supervised Image Captioning Approach (WICA) to generate captions containing rich contextual information, without requiring complete annotations of the contextual information in the datasets. We use encoder-decoder neural networks to extract basic captioning features and leverage object detection networks to identify contextual features. We then encode the two levels of features with a phrase-based language model to generate captions with rich contextual information. Comprehensive experimental results show that the proposed model outperforms existing baselines in terms of the richness and reasonableness of contextual information for image captioning.