
Local relation network with multilevel attention for visual question answering



Abstract

With the tremendous success of visual question answering (VQA), visual attention mechanisms have become an indispensable part of VQA models. However, these attention-based methods do not consider relationships among image regions, which are crucial for the model to understand the image thoroughly. We propose local relation networks (LRNs) that generate a context-aware image feature for each image region, containing information on the relationships among the other image regions. Furthermore, we propose a multilevel attention mechanism that combines semantic information from the LRNs and the original image regions, making the model's decisions more reasonable. With these two measures, we improve the region representation and achieve better attention and VQA performance. We conduct numerous experiments on the COCO-QA dataset and the largest VQA v2.0 benchmark dataset. Our model achieves competitive results, and visual demonstrations show the effectiveness of the proposed LRNs and multilevel attention mechanism. (C) 2020 Published by Elsevier Inc.
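The abstract names two ideas: context-aware region features built from the other regions, and attention applied at more than one level. The sketch below is only an illustration of that general pattern and is not the authors' architecture. The abstract does not specify the relation function or how the levels are fused, so the dot-product similarity, the concatenation step, and every function name here are assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a new vector."""
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(dim)]

def local_relation_features(regions):
    """Hypothetical stand-in for an LRN: for each region, aggregate the
    OTHER regions, weighted by similarity to the current region.
    Assumption: dot-product similarity replaces the learned relation function."""
    context = []
    for i, region in enumerate(regions):
        others = [o for j, o in enumerate(regions) if j != i]
        weights = softmax([dot(region, o) for o in others])
        context.append(weighted_sum(weights, others))
    return context

def attend(question, features):
    """Question-guided soft attention over a list of feature vectors."""
    weights = softmax([dot(question, f) for f in features])
    return weights, weighted_sum(weights, features)

def multilevel_attention(question, regions):
    """Attend separately over the original regions and over the
    context-aware features, then fuse the two attended vectors.
    Assumption: fusion is plain concatenation."""
    context = local_relation_features(regions)
    w_region, v_region = attend(question, regions)
    w_context, v_context = attend(question, context)
    fused = v_region + v_context  # list concatenation, not addition
    return fused, w_region, w_context

# Toy input: three 2-D region features and a 2-D question embedding.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question = [1.0, 0.0]
fused, w_region, w_context = multilevel_attention(question, regions)
```

In a trained model the similarity and fusion steps would be learned layers; the toy version only shows how the two attention levels draw on different views of the same regions.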

Bibliographic details

  • Source
    Journal of visual communication & image representation | 2020, Issue 11 | 102762.1-102762.9 | 9 pages
  • Author affiliations

    Beijing Normal Univ Intelligent Comp & Software Res Ctr Sch Artificial Intelligence Beijing Peoples R China;

    Beijing Normal Univ Intelligent Comp & Software Res Ctr Sch Artificial Intelligence Beijing Peoples R China;

    Beijing Normal Univ Intelligent Comp & Software Res Ctr Sch Artificial Intelligence Beijing Peoples R China;

    Beijing Normal Univ Intelligent Comp & Software Res Ctr Sch Artificial Intelligence Beijing Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Visual question answering; Relation network; Attention mechanism;

