
Towards gaze-based video annotation

Abstract

This paper presents our efforts towards a framework for video annotation using gaze. In computer vision, video annotation (VA) is an essential step in providing a ground truth for the evaluation of object detection and tracking techniques. VA is a demanding element in the development of video processing algorithms, since each object of interest must be manually labelled. Although the community has handled VA for a long time, the size of new data sets and the complexity of new tasks push us to revisit it. A barrier to automated video annotation is recognizing the object of interest and tracking it over image sequences. To tackle this problem, we employ the concept of visual attention to enhance video annotation. In an image, human attention naturally gravitates to interesting areas that provide valuable information for extracting the objects of interest, which can be exploited to annotate videos. Under task-based gaze recording, we use an observer's gaze to filter seed object-detector responses in a video sequence. The filtered boxes are then passed to an appearance-based tracking algorithm. We evaluate the usefulness of gaze by comparing the algorithm's performance with and without it. We show that eye gaze is an influential cue for enhancing automated video annotation, significantly improving the annotation.
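
To make the pipeline described in the abstract concrete, the sketch below illustrates the gaze-filtering step only: per-frame detector boxes are kept when an observer's gaze sample falls inside (or near) them, and the surviving boxes would then seed an appearance-based tracker. This is a minimal illustrative assumption, not the authors' implementation; the Box structure, the margin parameter, and the function names are all hypothetical.

```python
# Illustrative sketch of gaze-based filtering of seed detections.
# Not the paper's implementation: data structures and parameters are assumed.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    """Axis-aligned detector response with a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def gaze_hits_box(gaze: Tuple[float, float], box: Box, margin: float = 0.0) -> bool:
    """Return True if the gaze point lies inside the box,
    optionally expanded by a pixel margin to absorb gaze noise."""
    gx, gy = gaze
    return (box.x1 - margin <= gx <= box.x2 + margin and
            box.y1 - margin <= gy <= box.y2 + margin)


def filter_detections_by_gaze(detections: List[Box],
                              gaze_points: List[Tuple[float, float]],
                              margin: float = 10.0) -> List[Box]:
    """Keep only detector responses that at least one gaze sample for
    this frame falls on; these become seed boxes for the tracker."""
    return [box for box in detections
            if any(gaze_hits_box(g, box, margin) for g in gaze_points)]


# Example usage with made-up numbers for a single frame:
frame_detections = [Box(50, 60, 120, 200, 0.9), Box(300, 40, 380, 150, 0.7)]
frame_gaze = [(85.0, 130.0), (90.0, 140.0)]  # observer fixations in this frame
seeds = filter_detections_by_gaze(frame_detections, frame_gaze)
# `seeds` would then be handed to an appearance-based tracking algorithm.
```

In this sketch, the margin acts as a simple tolerance for eye-tracker noise; any more elaborate association of gaze with detections (e.g. fixation duration or distance weighting) is beyond what the abstract specifies.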