
Video Scene Detection Using Compact Bag of Visual Word Models


Abstract

Segmenting a video into shots is the first step in video indexing and search. Video shots are mostly very short and give little meaningful insight into the visual content. Grouping shots with similar visual content, however, gives a better understanding of the video scene; this grouping is known as scene boundary detection or video segmentation into scenes. In this paper, we propose a model for segmenting video into visual scenes using the bag of visual words (BoVW) model. The video is first divided into shots, each of which is then represented by a set of key frames. Each key frame is in turn represented by a BoVW feature vector that is short and compact compared with classical BoVW model implementations. Two variations of the BoVW model are used: (1) the classical BoVW model and (2) the Vector of Locally Aggregated Descriptors (VLAD), an extension of the classical BoVW model. Shot similarity is computed from the distances between key-frame feature vectors within a sliding window of length L = 4, rather than by comparing each shot with a very long list of shots as in previous practice. Experiments on cinematic and drama videos show the effectiveness of the proposed framework. In the proposed model, the BoVW representation is a 25000-dimensional vector, whereas the VLAD representation is only a 2048-dimensional vector. BoVW achieves a segmentation accuracy of 0.90, whereas VLAD achieves 0.83.
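
The two ideas named in the abstract, VLAD aggregation and shot comparison inside a sliding window of length L = 4, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `vlad_encode` and `group_shots`, the Euclidean distance, the `threshold` value, and the rule "start a new scene when a shot is far from every shot in the window" are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def vlad_encode(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalised VLAD vector.

    descriptors: (n, d) array of local features from one key frame.
    centroids:   (k, d) array of visual words (e.g. from k-means).
    """
    # Assign each descriptor to its nearest centroid (visual word).
    dists = np.linalg.norm(
        descriptors[:, None, :] - centroids[None, :, :], axis=2
    )
    nearest = dists.argmin(axis=1)

    k, d = centroids.shape
    vlad = np.zeros((k, d))
    for word in range(k):
        members = descriptors[nearest == word]
        if len(members):
            # Sum of residuals between the descriptors and their centroid.
            vlad[word] = (members - centroids[word]).sum(axis=0)

    vlad = vlad.ravel()  # final length is k * d
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


def group_shots(shot_vectors, window=4, threshold=0.5):
    """Label each shot with a scene index using a sliding window.

    Assumed rule (not taken from the paper): a shot continues the
    current scene if it lies within `threshold` of at least one of
    the previous `window` shots; otherwise a new scene starts.
    """
    labels = [0]
    for i in range(1, len(shot_vectors)):
        start = max(0, i - window)
        dists = [
            np.linalg.norm(shot_vectors[i] - shot_vectors[j])
            for j in range(start, i)
        ]
        if min(dists) <= threshold:
            labels.append(labels[-1])      # continue the current scene
        else:
            labels.append(labels[-1] + 1)  # scene boundary
    return labels
```

Each shot is compared with at most `window` earlier shots, so the cost grows linearly with the number of shots rather than quadratically, which is the practical benefit of a short window over comparing every shot with a long list of shots.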

Bibliographic details

  • Source
    Advances in Multimedia | 2018, Issue 2018 | 2564963.1-2564963.9 | 9 pages
  • Author affiliations

    Department of Computer Science & IT, University of Balochistan, Pakistan;

    Department of Computer Science & IT, University of Balochistan, Pakistan;

    Department of Computer Science & IT, University of Balochistan, Pakistan;

    Department of Computer Science, Sukkur IBA University, Pakistan;

    Department of Computer Science & IT, University of Balochistan, Pakistan;

    Department of Computer Science, Sardar Bahadur Khan Women's University, Pakistan;

  • Original format: PDF
  • Text language: English
