Journal: Image and Vision Computing

Interpretable visual reasoning: A survey

Abstract

Visual reasoning is the process of answering questions about visual information. Most current visual reasoning models are built on deep learning with end-to-end architectures. Although these models perform well, they are usually black boxes to their users, and the rationale behind their reasoning is difficult to understand. In recent years, the research community has recognized the importance of interpretability in visual reasoning and has developed a series of Interpretable Visual Reasoning (IVR) models. In this paper, we review these models. First, we establish a taxonomy based on the four forms of explanation used in current visual reasoning: vision, text, graph, and symbol. Second, we examine typical IVR models in each category and analyze their pros and cons. Third, we describe the current mainstream datasets for visual reasoning and VQA, and analyze how these datasets promote IVR research from different perspectives. Finally, we summarize the challenges facing IVR and point out potential research directions. (c) 2021 Elsevier B.V. All rights reserved.