Journal: Signal Processing: Image Communication (A Publication of the European Association for Signal Processing)

A generic framework for optimal 2D/3D key-frame extraction driven by aggregated saliency maps



Abstract

This paper proposes a generic framework for extraction of key-frames from 2D or 3D video sequences, relying on a new method to compute 3D visual saliency. The framework comprises the following novel aspects that distinguish this work from previous ones: (i) the key-frame selection process is driven by an aggregated saliency map, computed from various feature maps, which in turn correspond to different visual attention models; (ii) a method for computing aggregated saliency maps in 3D video is proposed and validated using fixation density maps, obtained from ground-truth eye-tracking data; (iii) 3D video content is processed within the same framework as 2D video, by including a depth feature map in the aggregated saliency. A dynamic programming optimisation algorithm is used to find the best set of K frames that minimises the dissimilarity error (i.e., maximises similarity) between the original video shots of size N > K and those reconstructed from the key-frames. Using different performance metrics and publicly available databases, the simulation results demonstrate that the proposed framework outperforms similar state-of-the-art methods and achieves performance comparable to other, quite different, approaches. Overall, the proposed framework is validated for a wide range of visual content and has the advantage of being independent of any specific visual saliency model or similarity metric. (C) 2015 Elsevier B.V. All rights reserved.
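The two core steps of the abstract, aggregating per-model feature maps into one saliency map and selecting K key-frames by dynamic programming, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes frames are summarised by feature vectors, uses Euclidean distance as a stand-in for the paper's (unspecified here) dissimilarity metric, represents each contiguous segment by its medoid frame, and the function names (`aggregate_saliency`, `select_keyframes`) are hypothetical.

```python
import numpy as np

def aggregate_saliency(feature_maps, weights=None):
    """Combine per-model feature maps (e.g. intensity, colour, motion and,
    for 3D content, depth) into one aggregated saliency map: normalise
    each map to [0, 1], then take a weighted average (uniform by default).
    Illustrative aggregation rule, not the paper's exact formula."""
    maps = [(m - m.min()) / (np.ptp(m) or 1.0) for m in feature_maps]
    if weights is None:
        weights = [1.0 / len(maps)] * len(maps)
    return sum(w * m for w, m in zip(weights, maps))

def segment_cost(features, i, j):
    """Cost of summarising frames i..j (inclusive) by their medoid frame,
    using Euclidean feature distance as an illustrative dissimilarity."""
    seg = features[i:j + 1]
    d = np.linalg.norm(seg[:, None, :] - seg[None, :, :], axis=2)
    totals = d.sum(axis=1)        # total distance if each frame were the key
    k = int(np.argmin(totals))
    return totals[k], i + k       # (segment cost, index of chosen key-frame)

def select_keyframes(features, K):
    """Partition the N frames into K contiguous segments via dynamic
    programming, each segment represented by one key-frame, minimising the
    total dissimilarity between frames and their representing key-frames."""
    N = len(features)
    INF = float("inf")
    cost = [[INF] * (K + 1) for _ in range(N + 1)]  # cost[j][k]: first j frames, k segments
    back = [[None] * (K + 1) for _ in range(N + 1)]
    cost[0][0] = 0.0
    for j in range(1, N + 1):
        for k in range(1, min(j, K) + 1):
            for i in range(k - 1, j):               # last segment = frames i..j-1
                c, key = segment_cost(features, i, j - 1)
                if cost[i][k - 1] + c < cost[j][k]:
                    cost[j][k] = cost[i][k - 1] + c
                    back[j][k] = (i, key)
    keys, j, k = [], N, K                           # backtrack the optimal partition
    while k > 0:
        i, key = back[j][k]
        keys.append(key)
        j, k = i, k - 1
    return sorted(keys), cost[N][K]
```

For example, six frames whose features cluster around 0, 5 and 10 with K = 3 yield one key-frame per cluster; the exhaustive inner loop makes this O(K N^2) segment evaluations, which is the usual trade-off for an optimal (rather than greedy) shot summary.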
