Query-aware video encoder for video moment retrieval

Hao Jiachang; Sun Haifeng; Ren Pengfei; Wang Jingyu; Qi Qi; Liao Jianxin

Abstract

Given an untrimmed video and a sentence query, video moment retrieval aims to locate the target video moment that semantically corresponds to the query. It is a challenging task that requires a joint understanding of natural language queries and video content. However, a video contains complex content, both query-related and query-irrelevant, which makes this joint understanding difficult. To this end, we propose a query-aware video encoder to capture the query-related visual content. Specifically, we design a query-guided block that follows each encoder layer and recalibrates the encoded visual features according to the query semantics. The core of the query-guided block is a channel-level attention gating mechanism, which selectively emphasizes query-related visual content and suppresses query-irrelevant content. In addition, to fully match the different levels of content in videos, we learn hierarchical and structural query clues to guide the capture of visual content. We disentangle the sentence query into a semantics graph and capture the local contexts inside the graph via a trilinear model as query clues. Extensive experiments on the Charades-STA and TACoS datasets demonstrate the effectiveness of our approach, and we achieve state-of-the-art results on both datasets. (c) 2022 Elsevier B.V. All rights reserved.
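The channel-level gating idea in the abstract can be illustrated with a small, framework-free sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: a single linear projection of a query embedding followed by a sigmoid, and the hypothetical names `query_guided_gate`, `recalibrate`, `weight`, and `bias`. It shows only the general pattern of turning a query into one gate per visual channel and rescaling every frame by it, not the authors' actual block.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def query_guided_gate(query_vec, weight, bias):
    """Map a query embedding to one gate value in (0, 1) per visual channel.

    Assumed form: gate = sigmoid(weight @ query_vec + bias), where weight has
    one row per channel. This is an illustrative choice, not the paper's.
    """
    return [
        sigmoid(sum(w * q for w, q in zip(row, query_vec)) + b)
        for row, b in zip(weight, bias)
    ]


def recalibrate(features, gate):
    """Scale each frame's channels by the shared query-dependent gate."""
    return [[x * g for x, g in zip(frame, gate)] for frame in features]


# Toy input: 2 frames with 3 channels each, and a 2-dimensional query embedding.
features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
query_vec = [1.0, -1.0]

# Hand-picked parameters (learned in a real model): channel 0 responds to the
# first query dimension, channel 2 to the second, channel 1 ignores the query.
weight = [[4.0, 0.0], [0.0, 0.0], [0.0, 4.0]]
bias = [0.0, 0.0, 0.0]

gate = query_guided_gate(query_vec, weight, bias)
recalibrated = recalibrate(features, gate)
```

With these toy parameters, channel 0 is emphasized (gate close to 1), channel 2 is suppressed (gate close to 0), and channel 1 is left at the neutral value 0.5. The same gate is applied to every frame, which is what makes the attention channel-level rather than temporal.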

Bibliographic record

Source
Neurocomputing | 2022, Issue 28 | pp. 72-86 | 15 pages
Authors
Hao Jiachang; Sun Haifeng; Ren Pengfei; Wang Jingyu; Qi Qi; Liao Jianxin
Author affiliation

Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Video moment retrieval; Temporal sentence grounding; Video and language;
