
Remember and forget: video and text fusion for video question answering

Abstract

Video question answering (Video QA), the task of answering questions about the visual content of a video clip, has received much attention in recent years. The task can be solved from the video data alone, but when a clip comes with relevant text information, it can also be solved using fused video and text data. This raises two problems: how to select the useful region features from the video frames and the useful text features from the text information, and how to fuse the video and text features. We propose a forget memory network to address both. In its video-only framework, the forget memory network solves the Video QA task from the video data alone: it selects the region features that are useful for the question and forgets the irrelevant region features in the video frames. In its video-and-text framework, it extracts the text features that are useful for the question, forgets the irrelevant ones, and fuses the video and text data to solve the Video QA task. The fused video and text features help improve the experimental performance.

Bibliographic details

  • Source
    Multimedia Tools and Applications, 2018, Issue 22, pp. 29269-29282 (14 pages)
  • Author affiliations

    School of Computer and Information Engineering, Anyang Normal University; Henan Key Laboratory of Oracle Bone Inscriptions Information Processing, Anyang Normal University; Collaborative Innovation Center of International Dissemination of Chinese Language Henan Province (HNIDCL)

    School of Computer Science and Technology, Tianjin University

    School of Computer and Information Engineering, Anyang Normal University; Henan Key Laboratory of Oracle Bone Inscriptions Information Processing, Anyang Normal University; Collaborative Innovation Center of International Dissemination of Chinese Language Henan Province (HNIDCL)

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Video QA; Forget memory network; Fused video and text features;
