Adaptive Attention-based High-level Semantic Introduction for Image Caption

Liu Xiaoxiao; Xu Qingyang


Abstract

Several studies have integrated a spatial visual attention mechanism into image caption models and introduced semantic concepts to guide caption generation. High-level semantic information captures the abstract and general content of an image, which helps improve model performance. However, this high-level information is usually a static representation that does not take the salient elements into account. This article therefore applies a semantic attention mechanism to the high-level information instead of the conventional static representation. The salient high-level semantic information can be considered as redundant semantic information for image caption generation. Additionally, the generation of visual words and non-visual words can be separated: an adaptive attention mechanism switches the guidance for caption generation between new fusion information (a fusion of image features and high-level semantics) and a language model. Visual words can thus be generated from the image features and high-level semantic information, while non-visual words can be predicted by the language model. The semantic attention, the adaptive attention, and the previously generated words are fused to construct a special attention module for the input and output of a long short-term memory (LSTM) network. An image caption can then be generated as a concise sentence that accurately captures the rich content of the image. The experimental results show that the proposed model performs promisingly on the evaluation metrics and that the captions are logical and richly descriptive.
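The switching idea in the abstract can be illustrated with a small, simplified sketch. It is not the authors' implementation: the abstract gives no equations, so the dot-product scoring, the element-wise sum used for fusion, and all names (`adaptive_context`, `region_feats`, `semantic_feats`, `sentinel`) are assumptions made for illustration. The sketch attends over high-level semantic vectors, fuses the result with each image region feature, and then lets a single softmax decide how much weight goes to the fused visual information versus a sentinel vector standing in for the language model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def adaptive_context(hidden, region_feats, semantic_feats, sentinel):
    """Return (context, beta) for one decoding step.

    hidden         -- decoder hidden state (hypothetical stand-in for the LSTM state)
    region_feats   -- list of image region feature vectors
    semantic_feats -- list of high-level semantic concept vectors
    sentinel       -- vector representing what the language model already knows
    beta           -- weight placed on the sentinel; near 1 means "rely on the
                      language model", near 0 means "rely on the fused visual input"
    """
    # Semantic attention: weight the concept vectors by relevance to the state.
    sem_weights = softmax([dot(hidden, s) for s in semantic_feats])
    sem_ctx = weighted_sum(sem_weights, semantic_feats)

    # Fusion (assumed here to be an element-wise sum) of each region with
    # the attended semantics.
    fused = [[r + s for r, s in zip(region, sem_ctx)] for region in region_feats]

    # One softmax over all fused regions plus the sentinel.
    scores = [dot(hidden, f) for f in fused] + [dot(hidden, sentinel)]
    alpha = softmax(scores)
    beta = alpha[-1]

    # Context is a convex combination of the fused regions and the sentinel.
    context = weighted_sum(alpha, fused + [sentinel])
    return context, beta
```

In this sketch a large `beta` corresponds to predicting a non-visual word from the language model, while a small `beta` corresponds to generating a visual word from the fused image and semantic features. A trained model would use learned projections rather than raw dot products.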


Bibliographic record

Source
ACM Transactions on Multimedia Computing, Communications, and Applications | 2020, Issue 4 | 128.1-128.22 | 22 pages
Authors
Liu Xiaoxiao; Xu Qingyang
Author affiliations

Shandong Univ Sch Mech Elect & Informat Engn Weihai 264209 Shandong Peoples R China;

Shandong Univ Sch Mech Elect & Informat Engn Weihai 264209 Shandong Peoples R China;

Original format: PDF
Language: English
Keywords
Image caption; high-level semantic; adaptive attention; CNN; LSTM; visual sentinel

Similar literature

1. Image Captioning with Bidirectional Semantic Attention-Based Guiding of Long Short-Term Memory [J]. Cao Pengfei, Yang Zhongyi, Sun Liang. Neural Processing Letters, 2019, Issue 1.
2. Video Captioning With Attention-Based LSTM and Semantic Consistency [J]. Lianli Gao, Zhao Guo, Hanwang Zhang. IEEE Transactions on Multimedia, 2017, Issue 9.
3. Regenerating Image Caption with High-Level Semantics [C]. Wei-Dong Tian, Nan-Xun Wang, Yue-Lin Sun. International Conference on Intelligent Computing, 2020.
4. Generation of Humorous Caption for Cartoon Images Using Deep Learning [D]. Shanmuga Sundaram, Rajesh. 2018.
5. Automated Semantic Indexing of Figure Captions to Improve Radiology Image Retrieval [O]. Charles E. Kahn Jr., Daniel L. Rubin. 2009.
6. Image Captioning with Bidirectional Semantic Attention-Based Guiding of Long Short-Term Memory [O]. Pengfei Cao, Zhongyi Yang, Liang Sun. 2019.
