
Beyond dynamic textures: A family of stochastic dynamical models for video with applications to computer vision.


Abstract

One family of visual processes relevant to many applications of computer vision can be loosely described as ensembles of particles subject to stochastic motion. The particles can be microscopic (e.g. plumes of smoke), macroscopic (e.g. leaves blowing in the wind), or even whole objects (e.g. a human crowd or a traffic jam). The applications range from remote monitoring for the prevention of natural disasters (e.g. forest fires), to background subtraction in challenging environments (e.g. outdoor scenes with moving trees in the background), to surveillance (e.g. traffic monitoring). Despite their practical significance, the visual processes in this family still pose tremendous challenges for computer vision. In particular, the stochastic nature of the motion fields is highly challenging for traditional motion representations such as optical flow, parametric motion models, and object tracking. Recent efforts have moved towards modeling video motion probabilistically, by viewing video sequences as "dynamic textures" or, more precisely, as samples from a generative, stochastic texture model defined over space and time. Despite its successes in applications such as video synthesis, motion segmentation, and video classification, the dynamic texture model has several major limitations, such as an inability to account for visual processes consisting of multiple co-occurring textures (e.g. smoke rising from a fire), and an inability to model complex motion (e.g. panning camera motion).

We propose a family of dynamical models for video that address the limitations of the dynamic texture, and apply these new models to challenging computer vision problems. In particular, we introduce two multi-modal models for video, the mixture of dynamic textures and the layered dynamic texture, which provide principled frameworks for video clustering and motion segmentation. We also propose a non-linear model, the kernel dynamic texture, which can capture complex patterns of motion through a non-linear manifold embedding. We present a new framework for the classification of dynamic textures, which combines the modeling power of the dynamic texture with the generalization guarantees of the support vector machine classifier, by deriving a new probabilistic kernel based on the Kullback-Leibler divergence between dynamic textures. Finally, we demonstrate the applicability of these models to a wide variety of real-world computer vision problems, including motion segmentation, video clustering, video texture classification, highway traffic monitoring, crowd counting, and adaptive background subtraction. We also demonstrate that the dynamic texture is a suitable representation for musical signals, by applying the proposed models to the computer audition task of song segmentation. These successes validate the dynamic texture framework as a principled approach for representing video, and suggest that the models could be useful in other domains, such as computer audition, that require the analysis of time-series data.
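
The abstract describes the dynamic texture as a generative, stochastic model over space and time but does not write it out. As a rough illustration only, the sketch below assumes the linear-Gaussian state-space form commonly used for dynamic textures in the literature, x_{t+1} = A x_t + v_t and y_t = C x_t + w_t, where x_t is a low-dimensional hidden state and y_t is a flattened video frame. The function names, the isotropic noise, and the toy matrices are hypothetical choices made for this example; the thesis's exact parameterization may differ.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def sample_dynamic_texture(A, C, x0, n_frames,
                           state_std=0.1, obs_std=0.01, seed=0):
    """Draw frames from x_{t+1} = A x_t + v_t, y_t = C x_t + w_t.

    A is the n x n state transition matrix, C the m x n observation
    matrix. v_t and w_t are zero-mean Gaussian noise; using isotropic
    noise with a single standard deviation each is an assumption of
    this sketch. Returns n_frames vectors of length m.
    """
    rng = random.Random(seed)
    x = list(x0)
    frames = []
    for _ in range(n_frames):
        # Observation: project the hidden state to pixel space, add noise.
        frames.append([c + rng.gauss(0.0, obs_std) for c in matvec(C, x)])
        # Transition: evolve the hidden state, add driving noise.
        x = [a + rng.gauss(0.0, state_std) for a in matvec(A, x)]
    return frames


# Toy example (hypothetical values): a 2-D hidden state following a
# damped rotation drives a 4-pixel "video".
theta = 0.3
A = [[0.95 * math.cos(theta), -0.95 * math.sin(theta)],
     [0.95 * math.sin(theta), 0.95 * math.cos(theta)]]
C = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-0.5, 0.5]]
frames = sample_dynamic_texture(A, C, x0=[1.0, 0.0], n_frames=50)
```

A single pair (A, C) yields one texture with one type of motion, which is the limitation the abstract points to. The mixture and layered models it proposes would, loosely speaking, involve several such components, which this sketch does not attempt to show.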
