Modeling visual and word-conditional semantic attention for image captioning

Wu Chunlei; Wei Yiwei; Chu Xiaoliang; Su Fei; Wang Leiquan



Abstract

Extensive efforts have been devoted to attention-based frameworks for image captioning, which perform well when the generated words have an explicit correspondence with an image region. However, the generation of function words, such as "on" and "of", has not been investigated. In this paper, a dual temporal model is first proposed for image captioning to address the role of visual information at every time step. Based on the dual temporal model, word-conditional semantic attention is then proposed to solve the problem of function word generation. Finally, a balance strategy based on the attention variation is adopted to make a trade-off between visual attention and word-conditional semantic attention. Extensive experiments are conducted on the Flickr30k and COCO datasets to validate the effectiveness of the proposed method.
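The abstract does not give the equations, so the following is only a minimal sketch of the general idea of trading off a visual attention context against a word-conditional one. It is not the authors' formulation. The dot-product scoring, the function names (`attend`, `balanced_context`) and the scalar `gate_logit` are all assumptions made for illustration. In particular, the paper derives its balance from the attention variation, whereas here the gate is simply passed in.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, items):
    """Dot-product soft attention (assumed scoring function).

    Returns the attention weights and the weighted sum of `items`.
    """
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    context = [
        sum(w * item[d] for w, item in zip(weights, items))
        for d in range(dim)
    ]
    return weights, context


def balanced_context(query, region_feats, word_embs, gate_logit):
    """Mix a visual context and a word-conditional context.

    `gate_logit` is a hypothetical scalar; the paper bases its balance
    on the attention variation, which is not reproduced here.
    """
    _, visual_ctx = attend(query, region_feats)   # attention over image regions
    _, word_ctx = attend(query, word_embs)        # attention over earlier words
    beta = 1.0 / (1.0 + math.exp(-gate_logit))    # sigmoid gate in (0, 1)
    mixed = [beta * v + (1.0 - beta) * w for v, w in zip(visual_ctx, word_ctx)]
    return beta, mixed


# Toy usage: two image regions, two previously generated word embeddings.
decoder_state = [1.0, 0.0]
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[0.5, 0.5], [0.2, 0.8]]
beta, context = balanced_context(decoder_state, regions, words, gate_logit=0.0)
```

With a gate near 1 the mixed context is dominated by the image regions, which is the behavior one would want for visually grounded words. A gate near 0 leans on the previously generated words, which is the case the abstract associates with function words such as "on" and "of".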


Bibliographic record

Source
Signal Processing. Image Communication: A Publication of the European Association for Signal Processing | 2018, issue 2018 | 8 pages
Authors
Wu Chunlei; Wei Yiwei; Chu Xiaoliang; Su Fei; Wang Leiquan
Author affiliations

China Univ Petr East China, Coll Comp & Commun Engn, Qingdao, Peoples R China;
China Univ Petr East China, Coll Comp & Commun Engn, Qingdao, Peoples R China;
China Univ Petr East China, Coll Comp & Commun Engn, Qingdao, Peoples R China;
Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China;
China Univ Petr East China, Coll Comp & Commun Engn, Qingdao, Peoples R China;

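Indexing information
Original format: PDF
Language: English (eng)
Chinese Library Classification: image communication, multimedia communication; communication
Keywords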
Image captioning; Word-conditional semantic attention; Visual attention; Attention variation;


Similar literature

1. Modeling visual and word-conditional semantic attention for image captioning [J]. Wu Chunlei, Wei Yiwei, Chu Xiaoliang. Signal Processing. Image Communication: A Publication of the European Association for Signal Processing. 2018.
2. Image Captioning With Visual-Semantic Double Attention [J]. He Chen, Hu Haifeng. ACM Transactions on Multimedia Computing, Communications, and Applications. 2019, No. 1.
3. High-Quality Image Captioning With Fine-Grained and Semantic-Guided Visual Attention [J]. Zhang Zongjian, Wu Qiang, Wang Yang. IEEE Transactions on Multimedia. 2019, No. 7.
4. Image Captioning Based on Visual and Semantic Attention [C]. Haiyang Wei, Zhixin Li, Canlong Zhang. International Conference on Multimedia Modeling. 2020.
5. Modeling the control of attention by visual and semantic factors in real-world scenes [D]. Hwang, Alex Daejoon. 2010.
6. Social Image Captioning: Exploring Visual Attention and User Attention [O]. Leiquan Wang, Xiaoliang Chu, Weishan Zhang. 2018.
7. Stack-VS: Stacked Visual-Semantic Attention for Image Caption Generation [O]. Ling Cheng, Wei Wei, Xianling Mao. 2020.
