Refining Attention: A Sequential Attention Model for Image Captioning

Abstract

Visual attention is widely applied to image captioning. Previous work feeds visual attention and linguistic words together into a long short-term memory (LSTM) network, but neglects the sequential relation between the attention at different time steps during word prediction. Moreover, the degree of abstraction of visual attention usually differs from that of linguistic words. To address these issues, this work proposes a sequential attention model that handles visual attention by modeling this sequential relation, so that the internal relation among the attention at each word prediction step is used to enhance the visual information during sentence decoding. Experiments on the benchmark MSCOCO and Flickr30K datasets show that the proposed model performs well, reaching 108.1 CIDEr and 34.9 BLEU-4 on MSCOCO.
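
The abstract gives no equations, so the following is only a minimal sketch of one possible reading of the idea, not the authors' architecture. It assumes additive soft attention over image region features and a second recurrence that consumes the attended context at every step, so the attention history is carried forward separately from the word sequence. Plain tanh recurrences stand in for LSTM cells to keep the example short, and every name, shape, and parameter here (`soft_attention`, `rnn_step`, `init_params`, `decode`, `h_att`, `h_lang`) is hypothetical.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def soft_attention(regions, h_lang, p):
    # Additive attention: score each image region against the decoder state.
    scores = np.tanh(regions @ p["W_r"] + h_lang @ p["W_h"]) @ p["v"]
    weights = softmax(scores)            # shape (num_regions,), sums to 1
    return weights @ regions, weights    # context vector of shape (D,)


def rnn_step(x, h, W_x, W_h, b):
    # Plain tanh recurrence (assumption: a stand-in for an LSTM cell).
    return np.tanh(x @ W_x + h @ W_h + b)


def init_params(D, E, H, A, seed=0):
    # Small random weights; D = region dim, E = word dim,
    # H = hidden dim, A = attention dim.
    rng = np.random.default_rng(seed)
    shapes = {
        "W_r": (D, A), "W_h": (H, A), "v": (A,),
        "U_x": (D, H), "U_h": (H, H), "U_b": (H,),
        "V_x": (E + H, H), "V_h": (H, H), "V_b": (H,),
    }
    return {k: 0.1 * rng.standard_normal(s) for k, s in shapes.items()}


def decode(regions, word_vecs, p):
    # h_lang tracks the word sequence; h_att tracks the attention sequence.
    H = p["U_h"].shape[0]
    h_lang, h_att = np.zeros(H), np.zeros(H)
    all_weights = []
    for w_t in word_vecs:
        context, weights = soft_attention(regions, h_lang, p)
        # Attention recurrence: the current context is combined with
        # a summary of the contexts attended at earlier steps.
        h_att = rnn_step(context, h_att, p["U_x"], p["U_h"], p["U_b"])
        # Language recurrence sees the word and the attention summary.
        h_lang = rnn_step(np.concatenate([w_t, h_att]), h_lang,
                          p["V_x"], p["V_h"], p["V_b"])
        all_weights.append(weights)
    return np.stack(all_weights), h_att, h_lang
```

In this sketch `decode` would be called with a `(num_regions, D)` array of region features and a `(T, E)` array of word vectors. Keeping `h_att` separate from `h_lang` is one way to let visual and linguistic signals evolve at their own level of abstraction before they are combined.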

Publication details

Source: IEEE International Conference on Multimedia and Expo, 2018, pp. 1-6 (6 pages)
Authors: Fang Fang; Qinyu Li; Hanli Wang; Pengjie Tang
Original format: PDF
Keywords: Visualization; Feature extraction; Training; Linguistics; Decoding; Semantics; Data mining

Similar literature

1. The synergy of double attention: Combine sentence-level and word-level attention for image captioning [J]. Haiyang Wei, Zhixin Li, Canlong Zhang. Computer Vision and Image Understanding, 2020, Issue Deca.
2. Image Captioning Using Region-Based Attention Joint with Time-Varying Attention [J]. Wang Weixuan, Hu Haifeng. Neural Processing Letters, 2019, Issue 1.
3. Refining Attention: A Sequential Attention Model for Image Captioning [C]. Fang Fang, Qinyu Li, Hanli Wang. IEEE International Conference on Multimedia and Expo, 2018.
4. Arabic Image Captioning Using Deep Learning with Attention [D]. Sabri, Sabri Monaf. 2021.
5. Social Image Captioning: Exploring Visual Attention and User Attention [O]. Leiquan Wang, Xiaoliang Chu, Weishan Zhang. 2018.
6. Attention on Attention for Image Captioning [O]. Lun Huang, Wenmin Wang, Jie Chen. 2019.
