
Bayesian data association for temporal scene understanding.


Abstract

Understanding the content of a video sequence is not a particularly difficult problem for humans. We can easily identify objects, such as people, and track their position and pose within the 3D world. A computer system that could understand the world through video would be extremely beneficial in applications such as surveillance, robotics, and biology. Despite significant advances in areas like tracking and, more recently, 3D static scene understanding, such a vision system does not yet exist. In this work, I present progress on this problem, restricted to videos of objects that move smoothly and are relatively easy to detect, such as people. Our goal is to identify all the moving objects in the scene and track their physical state (e.g., their 3D position or pose) in the world throughout the video.

We develop a Bayesian generative model of a temporal scene, in which we separately model data association, the 3D scene and imaging system, and the likelihood function. Under this model, the video data is the result of capturing the scene with the imaging system and noisily detecting video features. This formulation is very general and can be used to model a wide variety of scenarios, including videos of people walking and time-lapse images of pollen tubes growing in vitro. Importantly, we model the scene in world coordinates and units, as opposed to pixels, allowing us to reason about the world in a natural way, e.g., explaining occlusion and perspective distortion. We use Gaussian processes to model motion, and propose that they are a general and effective way to characterize smooth, but otherwise arbitrary, trajectories.

We perform inference using MCMC sampling, fitting our model of the temporal scene to data extracted from the videos. We address the problem of variable dimensionality by estimating data association and integrating out all scene variables. Our experiments show that our approach is competitive, producing results comparable to those of state-of-the-art methods.
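The motion model described in the abstract lends itself to a short illustration. Below is a minimal sketch, not taken from the dissertation, of placing a Gaussian-process prior over a smooth ground-plane trajectory and sampling a path from it; the squared-exponential kernel, its hyperparameters, and the 2D state are illustrative assumptions.

```python
import numpy as np

def se_kernel(t1, t2, scale=1.0, length=5.0):
    """Squared-exponential covariance: encodes smooth trajectories."""
    d = t1[:, None] - t2[None, :]
    return scale**2 * np.exp(-0.5 * (d / length)**2)

# Frame times at which the object is observed.
t = np.arange(0.0, 50.0)
K = se_kernel(t, t) + 1e-8 * np.eye(len(t))  # jitter for numerical stability

# Each world coordinate (x, z on the ground plane) gets an independent
# GP prior; draw one smooth sample trajectory.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(K)
traj = L @ rng.standard_normal((len(t), 2))  # shape (frames, 2)
```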
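The inference strategy of integrating out scene variables also has a well-known closed form under a GP prior with Gaussian observation noise: the marginal likelihood of the detections assigned to one track. The sketch below illustrates that computation under the same assumed kernel; the noise variance and the example detections are hypothetical, not the author's code or data.

```python
import numpy as np

def se_kernel(t1, t2, scale=1.0, length=5.0):
    d = t1[:, None] - t2[None, :]
    return scale**2 * np.exp(-0.5 * (d / length)**2)

def gp_log_marginal(t, y, noise_var=0.25):
    """log p(y | t) with the latent trajectory integrated out:
    each column of y is modeled as N(0, K + noise_var * I)."""
    K = se_kernel(t, t) + noise_var * np.eye(len(t))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    return (-0.5 * np.sum(y * alpha)
            - y.shape[1] * np.sum(np.log(np.diag(L)))
            - 0.5 * y.size * np.log(2 * np.pi))

# Detections (frame time, 2D world position) currently assigned to one track.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([[0.0, 0.1], [0.5, 0.2], [1.1, 0.3], [1.4, 0.5]])
print(gp_log_marginal(t, y))
```

An MCMC move that reassigns a detection between tracks can then be scored by the change in these marginal likelihoods, which is how sampling over the discrete data association sidesteps the variable dimensionality of the continuous scene variables.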

Bibliographic record

  • Author: Brau Avila, Ernesto
  • Affiliation: The University of Arizona
  • Degree grantor: The University of Arizona
  • Subjects: Computer Science; Artificial Intelligence; Statistics
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 114 p.
  • Format: PDF
  • Language: English
