
Adaptive attention model for image captioning



Abstract

The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.
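The mixing step described above can be illustrated with a minimal sketch. The abstract gives no equations, so everything here is an assumption for illustration: the function names, the use of precomputed attention scores, the sigmoid-gated form of the sentinel, and the single softmax taken jointly over the image regions and the sentinel. The weight that the softmax assigns to the sentinel (called `beta` here) plays the role of the decision about how heavily to rely on the linguistic model rather than the image.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def visual_sentinel(gate_preact, memory):
    """Assumed sentinel form: sigmoid(gate) * tanh(memory cell), elementwise."""
    return [
        (1.0 / (1.0 + math.exp(-g))) * math.tanh(m)
        for g, m in zip(gate_preact, memory)
    ]


def adaptive_context(features, region_scores, sentinel, sentinel_score):
    """Mix spatial image features with the visual sentinel.

    features:       k region feature vectors, each of length d
    region_scores:  k unnormalized attention scores (hypothetical inputs)
    sentinel:       visual sentinel vector of length d
    sentinel_score: unnormalized attention score for the sentinel
    Returns (context, beta), where beta is the weight on the sentinel.
    """
    # One softmax over the k regions plus the sentinel.
    weights = softmax(list(region_scores) + [sentinel_score])
    beta = weights[-1]

    # Start from the sentinel's share, then add each region's share.
    context = [beta * value for value in sentinel]
    for weight, feature in zip(weights[:-1], features):
        for j, value in enumerate(feature):
            context[j] += weight * value
    return context, beta
```

A `beta` close to 1 means the next word is driven almost entirely by the sentinel (the language side); a `beta` close to 0 means it is driven by the attended image regions.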


Bibliographic details

Publication number: US10565305B2

Publication date: 2020-02-18

Original format: PDF
Applicant/Assignee: SALESFORCE.COM INC.

Application number: US201715817161
Inventors: JIASEN LU; CAIMING XIONG; RICHARD SOCHER

Filing date: 2017-11-17
Classification: G06K9; G06F17/27; G06K9/62; G06K9/46; G06F17/24; G06K9/48; G06K9/66; G06N3/08; G06N3/04
Country: US
Date added to database: 2022-08-21 11:29:48
