
Towards gaze-based video annotation

Abstract

This paper presents our efforts towards a framework for video annotation using gaze. In computer vision, video annotation (VA) is an essential step in providing a ground truth for the evaluation of object detection and tracking techniques. VA is a demanding element in the development of video processing algorithms, since each object of interest must be manually labelled. Although the community has handled VA for a long time, the size of new data sets and the complexity of new tasks push us to revisit it. A barrier to automated video annotation is recognizing the object of interest and tracking it over image sequences. To tackle this problem, we employ the concept of visual attention to enhance video annotation. In an image, human attention naturally gravitates to interesting areas that provide valuable information for extracting the objects of interest, which can be exploited to annotate videos. Under task-based gaze recording, we use an observer's gaze to filter seed object-detector responses in a video sequence. The filtered boxes are then passed to an appearance-based tracking algorithm. We evaluate the usefulness of gaze by comparing the algorithm's performance with and without it. We show that eye gaze is an influential cue for enhancing automated video annotation, significantly improving the annotation.
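
To make the pipeline described in the abstract concrete, the sketch below illustrates the gaze-filtering step only: per-frame detector boxes are kept when an observer's gaze sample falls inside (or near) them, and the surviving boxes would then seed an appearance-based tracker. This is a minimal illustrative assumption, not the authors' implementation; the Box structure, the margin parameter, and the function names are all hypothetical.

```python
# Illustrative sketch of gaze-based filtering of seed detections.
# Not the paper's implementation: data structures and parameters are assumed.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    """Axis-aligned detector response with a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def gaze_hits_box(gaze: Tuple[float, float], box: Box, margin: float = 0.0) -> bool:
    """Return True if the gaze point lies inside the box,
    optionally expanded by a pixel margin to absorb gaze noise."""
    gx, gy = gaze
    return (box.x1 - margin <= gx <= box.x2 + margin and
            box.y1 - margin <= gy <= box.y2 + margin)


def filter_detections_by_gaze(detections: List[Box],
                              gaze_points: List[Tuple[float, float]],
                              margin: float = 10.0) -> List[Box]:
    """Keep only detector responses that at least one gaze sample for
    this frame falls on; these become seed boxes for the tracker."""
    return [box for box in detections
            if any(gaze_hits_box(g, box, margin) for g in gaze_points)]


# Example usage with made-up numbers for a single frame:
frame_detections = [Box(50, 60, 120, 200, 0.9), Box(300, 40, 380, 150, 0.7)]
frame_gaze = [(85.0, 130.0), (90.0, 140.0)]  # observer fixations in this frame
seeds = filter_detections_by_gaze(frame_detections, frame_gaze)
# `seeds` would then be handed to an appearance-based tracking algorithm.
```

In this sketch, the margin acts as a simple tolerance for eye-tracker noise; any more elaborate association of gaze with detections (e.g. fixation duration or distance weighting) is beyond what the abstract specifies.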