Journal: IEEE Transactions on Circuits and Systems for Video Technology

Video Tomographs and a Base Detector Selection Strategy for Improving Large-Scale Video Concept Detection

Abstract

This paper addresses video concept detection, with the aim of using the detection results for more effective concept-based video retrieval. Its key novelties are as follows. 1) Spatio-temporal video slices (tomographs) are used in the same way that visual keyframes are typically used in video concept detection schemes. These slices compactly capture motion patterns that are useful for detecting semantic concepts, and they are used to train a number of base detectors. These tomograph-based detectors augment the set of keyframe-based base detectors, which can be trained using different frame representations. 2) A generic methodology, built upon a genetic algorithm, is introduced for controlling which subset of the available base detectors (and consequently which subset of the possible shot representations) should be combined to develop an optimal detector for each specific concept. This methodology is directly applicable to the learning of hundreds of diverse concepts, and it departs from the one-size-fits-all approach typically used in problems of this size. The proposed techniques are evaluated on the datasets of the 2011 and 2012 Semantic Indexing Task of TRECVID, each comprising several hundred hours of heterogeneous video clips and ground-truth annotations for tens of concepts that vary significantly in generality, complexity, and human participation. The experimental results demonstrate the merit of the proposed techniques.
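The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the choice of the central row and column for the slices, late fusion by averaging detector scores, average precision as the fitness measure, and the genetic-algorithm settings (tournament selection, uniform crossover, bit-flip mutation, elitism). The abstract specifies none of these details.

```python
import numpy as np


def extract_tomographs(video, row=None, col=None):
    """Cut two spatio-temporal slices from a video of shape (T, H, W[, C]).

    Assumption: the central row and column are used when none is given.
    """
    video = np.asarray(video)
    _, height, width = video.shape[:3]
    row = height // 2 if row is None else row
    col = width // 2 if col is None else col
    horizontal = video[:, row, :]                   # shape (T, W[, C])
    vertical = np.swapaxes(video[:, :, col], 0, 1)  # shape (H, T[, C])
    return horizontal, vertical


def average_precision(labels, scores):
    """Non-interpolated average precision of a ranked list."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(labels)[order].astype(bool)
    if not hits.any():
        return 0.0
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float(precision_at_k[hits].mean())


def fitness(mask, detector_scores, labels):
    """Score a detector subset: fuse by averaging, then measure AP."""
    if not mask.any():
        return 0.0
    fused = detector_scores[mask].mean(axis=0)
    return average_precision(labels, fused)


def select_detectors_ga(detector_scores, labels, pop_size=20,
                        generations=30, mutation_rate=0.1, seed=0):
    """Search for a subset of base detectors for one concept.

    detector_scores: array (n_detectors, n_shots) of validation scores.
    labels: array (n_shots,) of 0/1 ground-truth annotations.
    Returns the best boolean mask evaluated and its fitness.
    """
    rng = np.random.default_rng(seed)
    n = detector_scores.shape[0]
    population = rng.random((pop_size, n)) < 0.5
    best_mask, best_fit = None, -1.0
    for _ in range(generations):
        fits = np.array([fitness(m, detector_scores, labels)
                         for m in population])
        top = int(fits.argmax())
        if fits[top] > best_fit:
            best_mask, best_fit = population[top].copy(), float(fits[top])
        # Binary tournament selection of parents.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = population[np.where(fits[a] >= fits[b], a, b)]
        # Uniform crossover between each parent and its neighbour.
        mates = np.roll(parents, 1, axis=0)
        take = rng.random((pop_size, n)) < 0.5
        children = np.where(take, parents, mates)
        # Bit-flip mutation, then elitism: keep the best mask seen so far.
        children ^= rng.random((pop_size, n)) < mutation_rate
        children[0] = best_mask
        population = children
    return best_mask, best_fit
```

In a setting like the one described, `select_detectors_ga` would be run once per concept, so that each concept gets its own subset of keyframe-based and tomograph-based detectors rather than a single shared combination.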
