International Conference on Multimedia Big Data

Reference Based on Adaptive Attention Mechanism for Image Captioning*



Abstract

Image captioning, at the intersection of computer vision and natural language processing, has received increasing attention in recent years. Most existing methods use attention-based CNN-RNN frameworks, which understand images effectively and generate more natural descriptions. Recent research has found that a textual expression of an event can effectively promote human understanding and lead to successful detection of objects and actions. Inspired by this textual information, in this paper we propose a Reference based on Adaptive Attention Mechanism (R-AAM) model, which adds a reference sentence to the attention mechanism to correct the image regions on which attention is concentrated. In both training and testing, the reference sentence is selected by computing the largest consensus score among the nearest images in the training set. The reference sentence associated with an image helps select the salient region for each generated word in the time sequence. Results show that sentences generated with the reference sentence express richer semantic information and correct misrecognition errors. Our proposed R-AAM method achieves comparable performance on the well-known public MSCOCO dataset under five popular evaluation metrics.
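The abstract does not give implementation details for the reference-sentence selection step. The following is a minimal sketch, under assumed conventions, of how a reference sentence could be chosen for a query image by retrieving its nearest training images and taking the candidate caption with the largest consensus score among the neighbors' captions. The feature representation, the neighborhood size k, and the sentence-similarity function (e.g. a CIDEr-style metric) are assumptions; all function names are illustrative, not the authors' code.

    # Sketch: consensus-based reference sentence selection (illustrative only).
    import numpy as np

    def nearest_images(query_feat, train_feats, k=60):
        """Indices of the k training images most similar to the query feature."""
        # Cosine similarity between the query and every training image feature.
        sims = train_feats @ query_feat / (
            np.linalg.norm(train_feats, axis=1) * np.linalg.norm(query_feat) + 1e-8
        )
        return np.argsort(-sims)[:k]

    def consensus_score(candidate, references, sentence_sim):
        """Average similarity of one candidate caption to all neighbor captions."""
        return np.mean([sentence_sim(candidate, r) for r in references])

    def select_reference(query_feat, train_feats, train_captions, sentence_sim, k=60):
        """Pick the neighbor caption with the largest consensus score."""
        idx = nearest_images(query_feat, train_feats, k)
        pool = [c for i in idx for c in train_captions[i]]  # candidate captions
        scores = [consensus_score(c, pool, sentence_sim) for c in pool]
        return pool[int(np.argmax(scores))]

The selected sentence would then be fed, alongside the image features, into the adaptive attention module to guide which region is attended to at each decoding step, as described in the abstract.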
