DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video


Abstract

This paper studies the task of temporal moment localization in long untrimmed videos using natural language queries. Given a query sentence, the goal is to determine the start and end of the relevant segment within the video. Our key innovation is to learn a video feature embedding through a language-conditioned message-passing algorithm, suited to temporal moment localization, that captures the relationships between humans, objects and activities in the video. These relationships are obtained by a spatial sub-graph that contextualizes the scene representation using detected object and human features conditioned on the language query. Moreover, a temporal sub-graph captures the activities within the video through time. Our method is evaluated on three standard benchmark datasets, and we also introduce YouCookII as a new benchmark for this task. Experiments show our method outperforms state-of-the-art methods on these datasets, confirming the effectiveness of our approach.
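As a rough illustration of what "language-conditioned message passing" over a spatial sub-graph can look like, here is a minimal sketch in plain Python. It is not the authors' model. The function `language_conditioned_step`, the toy node names, the two-dimensional features and the query vector are all hypothetical, and the dot-product softmax weighting with a residual update is a generic choice assumed for illustration only.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def language_conditioned_step(nodes, edges, query):
    """One round of message passing (hypothetical, for illustration).

    nodes: dict mapping node name -> feature vector (list of floats)
    edges: dict mapping node name -> list of neighbors that send it messages
    query: query embedding with the same dimension as the node features
    """
    updated = {}
    for name, feat in nodes.items():
        neighbors = edges.get(name, [])
        if not neighbors:
            updated[name] = list(feat)
            continue
        # Weight each neighbor by how well its feature matches the query.
        weights = softmax([dot(nodes[n], query) for n in neighbors])
        # Aggregate neighbor features into a single message.
        message = [
            sum(w * nodes[n][i] for w, n in zip(weights, neighbors))
            for i in range(len(feat))
        ]
        # Residual update: keep the node's own feature and add the message.
        updated[name] = [f + m for f, m in zip(feat, message)]
    return updated


# Toy spatial graph for a single frame: one human node, two object nodes.
nodes = {
    "human": [0.5, 0.5],
    "cup": [0.0, 1.0],
    "door": [1.0, 0.0],
}
edges = {"human": ["cup", "door"], "cup": ["human"], "door": ["human"]}
query = [0.0, 2.0]  # assumed embedding of a query that mentions the cup

updated = language_conditioned_step(nodes, edges, query)
```

In this toy setup the query vector aligns with the "cup" feature, so the human node receives most of its message from the cup rather than the door. A temporal sub-graph could apply the same kind of update across frames instead of across objects within one frame.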


Bibliographic details

Source
IEEE Winter Conference on Applications of Computer Vision, 2021, pp. 1078-1087 (10 pages)
Authors
Cristian Rodriguez-Opazo; Edison Marrese-Taylor; Basura Fernando; Hongdong Li; Stephen Gould
Original format
PDF
Keywords
Location awareness; Technological innovation; Computer vision; Conferences; Natural languages; Benchmark testing; Feature extraction

Similar literature


1. MRN: Moment Relation Network for Natural Language Video Localization with Transfer Learning [J]. Jiang Siyu, Wu Guobin. International Journal of Pattern Recognition and Artificial Intelligence, 2021, No. 7.

2. Querying video intervals by spatio-temporal relationships of moving object traces [J]. Chikashi Yajima, Yoshihiro Nakanishi, Katsumi Tanaka. 電子情報通信学会技術研究報告. デ-タ工学. Data Engineering, 2001, No. 193.

3. Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention [C]. Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Fatemeh Sadat Saleh. IEEE Winter Conference on Applications of Computer Vision, 2020.

4. Natural-language representation and query of linear geographic objects in GIS [D]. Xu, Jun. 2005.

5. GELLO: An Object-Oriented Query and Expression Language for Clinical Decision Support [O]. Margarita Sordo, Omolola Ogunyemi, Aziz A. Boxwala. 2003.

6. Retrieving videogame moments with natural language queries [O]. Xiaoxuan Zhang, Adam M. Smith. 2019.
