Conference: IEEE Winter Conference on Applications of Computer Vision

DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video


Abstract

This paper studies temporal moment localization in long untrimmed videos using natural language queries. Given a query sentence, the goal is to determine the start and end of the relevant segment within the video. Our key innovation is to learn a video feature embedding through a language-conditioned message-passing algorithm, suited to temporal moment localization, that captures the relationships between humans, objects and activities in the video. These relationships are obtained by a spatial sub-graph that contextualizes the scene representation using detected objects and human features conditioned on the language query. Moreover, a temporal sub-graph captures the activities within the video over time. Our method is evaluated on three standard benchmark datasets, and we also introduce YouCookII as a new benchmark for this task. Experiments show that our method outperforms state-of-the-art methods on these datasets, confirming the effectiveness of our approach.
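To make the idea of language-conditioned message passing concrete, here is a minimal sketch in plain Python. It is not the paper's formulation, which the abstract does not spell out. The function name `message_passing_step`, the dot-product relevance score, the softmax weighting, and the 50/50 update rule are all assumptions chosen for illustration, as are the toy node names and feature values.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def message_passing_step(nodes, edges, query):
    """One round of query-conditioned message passing (illustrative only).

    nodes: dict mapping node name -> feature vector (list of floats)
    edges: dict mapping node name -> list of neighbor names
    query: sentence embedding with the same dimension as the features
    """
    updated = {}
    for name, feat in nodes.items():
        neighbors = edges.get(name, [])
        if not neighbors:
            # Isolated nodes keep their features unchanged.
            updated[name] = list(feat)
            continue
        # Assumption: a neighbor matters more when it aligns with the query.
        weights = softmax([dot(query, nodes[n]) for n in neighbors])
        message = [
            sum(w * nodes[n][k] for w, n in zip(weights, neighbors))
            for k in range(len(feat))
        ]
        # Assumption: average the node's own features with the message.
        updated[name] = [0.5 * f + 0.5 * m for f, m in zip(feat, message)]
    return updated


# Toy spatial sub-graph for a single frame (hypothetical values).
nodes = {"human": [1.0, 0.0], "knife": [0.0, 1.0], "table": [0.0, 0.0]}
edges = {"human": ["knife", "table"], "knife": ["human"], "table": []}
query = [0.0, 2.0]  # stands in for an embedded query sentence

updated = message_passing_step(nodes, edges, query)
```

In this toy graph the query aligns with the "knife" features, so the "human" node draws most of its message from "knife" rather than "table". That is the sense in which the scene representation is conditioned on the language query.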
