Journal: Neurocomputing

Affective question answering on video



Abstract

Visual Question Answering (VQA) is an increasingly popular research area in machine learning. Most existing VQA tasks focus only on static images, and only a few models handle video. The primary purpose of this project is to develop an innovative model that performs Affective Question Answering on Video (AQAV), a multi-task architecture that implements a Video QA route and an Affective route. A pre-trained CNN emotion detector recognizes emotions in the frames of a video, and the resulting string of emotion labels is relayed to the Token-based, Frame-based and Integrated attention mechanisms. The attention model uses the visual features, the question and the emotion labels to focus on relevant frames of the video and on relevant regions within those frames. The string of emotion labels is also used to generate an emotion caption, which the Text QA module uses to prepare an affective answer. A conventional answer is generated by the processes along the Video QA route, while the affective answer is a product of both the Video QA and the Affective routes. Our model not only makes VQA more analytic by generating an explanatory answer, but also registers a quantitative improvement in performance over previous baselines. We show that injecting emotions into the attention mechanism boosts VQA performance. The AQAV model contributes to efforts to make machines understand sequential and dynamic visual scenes in the real world. © 2019 Published by Elsevier B.V.
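
The frame-level part of the attention described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how emotion labels are combined with visual features, so the additive fusion, the dot-product scoring, the emotion vocabulary, and the names `frame_attention` and `emotion_emb` are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def frame_attention(frame_feats, emotion_labels, question_vec, emotion_emb):
    """Weight video frames by relevance to the question.

    Assumption: each frame's visual feature is fused with the embedding of
    its emotion label by simple addition, then scored against the question
    vector with a dot product. The paper may use a different fusion.
    """
    fused = [
        [v + e for v, e in zip(feat, emotion_emb[label])]
        for feat, label in zip(frame_feats, emotion_labels)
    ]
    weights = softmax([dot(f, question_vec) for f in fused])
    dim = len(question_vec)
    # Attention-weighted sum of the fused frame vectors.
    context = [sum(w * f[i] for w, f in zip(weights, fused)) for i in range(dim)]
    return weights, context

# Toy inputs: three frames with 2-d features and hypothetical emotion labels,
# standing in for the output of the CNN emotion detector.
frame_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
emotion_labels = ["neutral", "happy", "sad"]
emotion_emb = {"neutral": [0.0, 0.0], "happy": [1.0, 0.0], "sad": [0.0, 0.0]}
question_vec = [1.0, 0.0]

weights, context = frame_attention(frame_feats, emotion_labels, question_vec, emotion_emb)
print(weights, context)
```

In this toy setup the second frame has no visual match with the question on its own; its emotion embedding is what raises its score, which is the kind of effect the abstract attributes to injecting emotions into the attention mechanism.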
