International Joint Conference on Artificial Intelligence
EntScene: Nonparametric Bayesian Temporal Segmentation of Videos Aimed at Entity-Driven Scene Detection

Abstract

In this paper, we study Bayesian techniques for entity discovery and temporal segmentation of videos. Existing temporal video segmentation techniques are based on low-level features, and are usually suitable for discovering short, homogeneous shots rather than diverse scenes, each of which contains several such shots. We define scenes in terms of semantic entities (e.g., persons). This is the first attempt at entity-driven scene discovery in videos without using metadata such as scripts. The problem is hard because we have no explicit prior information about the entities and the scenes. However, such sequential data exhibit temporal coherence in multiple ways, and this provides implicit cues. To capture these, we propose a Bayesian generative model, EntScene, that represents entities with mixture components and scenes with discrete distributions over these components. The most challenging part of this approach is the inference, as it involves complex interactions of latent variables. To this end, we propose an algorithm based on Dynamic Blocked Gibbs Sampling that attempts to jointly learn the components and the segmentation by progressively merging an initial set of short segments. The proposed algorithm compares favourably against suitably designed baselines on several TV-series videos. We extend the method to an unexplored problem: temporal co-segmentation of videos containing the same entities.
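To make the modelling idea concrete, the following is a minimal toy sketch in Python, not the authors' model or inference code. It assumes that each entity is an isotropic Gaussian mixture component and that each scene is a Dirichlet-drawn discrete distribution over entities. The merging step is a simple greedy stand-in that compares entity histograms of adjacent segments; it is not Dynamic Blocked Gibbs Sampling, and it assumes the entity labels are already known. All function names, parameters, and thresholds here are hypothetical.

```python
import numpy as np


def sample_video(num_scenes=3, num_entities=4, shots_per_scene=5,
                 frames_per_shot=4, dim=2, seed=0):
    """Sample toy frame features from a scene -> entity -> feature hierarchy."""
    rng = np.random.default_rng(seed)
    # Assumption: each entity is one mixture component (a Gaussian mean).
    entity_means = rng.normal(0.0, 5.0, size=(num_entities, dim))
    features, entity_labels, scene_labels = [], [], []
    for scene in range(num_scenes):
        # Assumption: each scene is a sparse discrete distribution over entities.
        scene_dist = rng.dirichlet(np.full(num_entities, 0.3))
        for _ in range(shots_per_scene):
            # One entity per shot, so consecutive frames are temporally coherent.
            entity = int(rng.choice(num_entities, p=scene_dist))
            for _ in range(frames_per_shot):
                features.append(rng.normal(entity_means[entity], 0.5))
                entity_labels.append(entity)
                scene_labels.append(scene)
    return np.array(features), np.array(entity_labels), np.array(scene_labels)


def merge_short_segments(entity_labels, num_entities, initial_len=4,
                         threshold=0.5):
    """Greedy stand-in for merging: join adjacent segments whose entity
    histograms are close in total variation distance."""
    n = len(entity_labels)
    # Start from an initial set of short, fixed-length segments.
    bounds = [(s, min(s + initial_len, n)) for s in range(0, n, initial_len)]

    def hist(a, b):
        h = np.bincount(entity_labels[a:b], minlength=num_entities)
        return h / h.sum()

    merged = [bounds[0]]
    for start, end in bounds[1:]:
        prev_start, prev_end = merged[-1]
        tv = 0.5 * np.abs(hist(prev_start, prev_end) - hist(start, end)).sum()
        if tv < threshold:
            merged[-1] = (prev_start, end)   # extend the current segment
        else:
            merged.append((start, end))      # start a new segment
    return merged
```

For example, `merge_short_segments(np.array([0]*8 + [1]*4), num_entities=2)` starts from three segments of four frames each and returns two segments, `[(0, 8), (8, 12)]`, because the first two share the same entity histogram. In the setting the abstract describes, the entity assignments are latent and would have to be inferred jointly with the segmentation rather than given.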