
Modeling visual and word-conditional semantic attention for image captioning



Abstract

Extensive effort has been devoted to attention-based frameworks for image captioning, which perform well when each generated word corresponds explicitly to an image region. However, the generation of function words, such as "on" and "of", has not been investigated. In this paper, a dual temporal modal is first proposed for image captioning to address the role of visual information at every time step. Based on the dual temporal modal, word-conditional semantic attention is then proposed to address the generation of function words. Finally, a balance strategy based on the attention variation is adopted to trade off visual attention against word-conditional semantic attention. Extensive experiments are conducted on the Flickr30k and COCO datasets to validate the effectiveness of the proposed method.
