Remember and forget: video and text fusion for video question answering

Feng Gao; Yuanyuan Ge; Yongge Liu

Abstract

Video question answering (Video QA), which answers questions about the visual content of a video clip, has received much attention in recent years. The task can be solved from the video data alone, but when a clip comes with relevant text information, it can also be solved using fused video and text data. Two problems then need to be addressed: how to select the useful region features from the video frames and the useful text features from the text information, and how to fuse the video and text features. We propose a forget memory network to solve these problems. The forget memory network with video framework solves the Video QA task from the video data alone: it selects the region features that are useful for the question and forgets the irrelevant region features in the video frames. The forget memory network with video and text framework extracts the useful text features, forgets the text features that are irrelevant to the question, and fuses the video and text data to solve the Video QA task. The fused video and text features help improve the experimental performance.
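The abstract gives no equations, so the following is only an illustrative sketch of the general idea of keeping question-relevant features, discarding the rest, and fusing two modalities. Everything in it is an assumption made for illustration and not the authors' design: the function names `remember_and_forget` and `fuse`, the dot-product relevance score, the uniform-weight cut-off, and fusion by concatenation.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def remember_and_forget(features, question, keep_threshold=None):
    """Pool features by question relevance, zeroing out low-relevance ones.

    Hypothetical rule: a feature is "forgotten" when its attention weight
    falls below keep_threshold (default: the uniform weight 1 / n).
    """
    weights = softmax([dot(f, question) for f in features])
    if keep_threshold is None:
        keep_threshold = 1.0 / len(features)
    kept = [w if w >= keep_threshold else 0.0 for w in weights]
    norm = sum(kept)
    if norm == 0.0:  # threshold removed everything: fall back to plain attention
        kept, norm = weights, 1.0
    kept = [w / norm for w in kept]
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(kept, features)) for d in range(dim)]
    return pooled, kept


def fuse(video_vec, text_vec):
    """Assumed fusion: simple concatenation of the two pooled vectors."""
    return video_vec + text_vec


# Toy example: three region features, three text features, one question vector.
regions = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
texts = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
question = [2.0, 0.0]

video_vec, region_weights = remember_and_forget(regions, question)
text_vec, text_weights = remember_and_forget(texts, question)
fused = fuse(video_vec, text_vec)
```

In this toy run the second region is orthogonal to the question, so its weight drops to zero and it contributes nothing to the pooled video vector. A learned model would replace the fixed threshold and the dot-product score with trained parameters.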

Bibliographic details

Source: Multimedia Tools and Applications, 2018, Issue 22, pp. 29269-29282 (14 pages)
Authors: Feng Gao; Yuanyuan Ge; Yongge Liu
Author affiliations:

School of Computer and Information Engineering, Anyang Normal University; Henan Key Laboratory of Oracle Bone Inscriptions Information Processing, Anyang Normal University; Collaborative Innovation Center of International Dissemination of Chinese Language Henan Province (HNIDCL)

School of Computer Science and Technology, Tianjin University

School of Computer and Information Engineering, Anyang Normal University; Henan Key Laboratory of Oracle Bone Inscriptions Information Processing, Anyang Normal University; Collaborative Innovation Center of International Dissemination of Chinese Language Henan Province (HNIDCL)

Original format: PDF
Language: English
Keywords: Video QA; Forget memory network; Fused video and text features
