Computing three-dimensional scene from a single image by bottom-up/top-down Bayesian inference.

Abstract

Human vision routinely perceives full 3D shape and scene structure from a single 2D image, with the occluded parts "filled in" by prior visual knowledge. Computing the 3D structures of all the objects in a scene from a single image is therefore a fundamental problem in computer vision. In this thesis, we propose a bottom-up/top-down Bayesian inference framework that computes the 3D structures of objects in the scene from a single image. The framework integrates the visual tasks involved (segmentation, perceptual grouping, object detection and recognition, and 3D reconstruction) in a principled way and incorporates prior visual knowledge into the inference.

The output of the inference framework is a hierarchical "parsing graph" with the scene label at the top (the root), objects with 3D structures and their parts at intermediate nodes, and image pixels at the bottom. The number of layers in this parsing graph is determined by the types of objects or visual patterns. The nodes in the parsing graph correspond to visual patterns represented by probabilistic models. The parsing graph has both top-down connections and horizontal spatial connections, which correspond to the generative models and to the spatial relations modeled by Markov random fields (MRFs), respectively.

Formulated in a Bayesian framework, the inference algorithm computes the parsing graph from the input image by optimizing a posterior probability. In this optimization process, we integrate two popular computing paradigms in computer vision: generative methods and discriminative methods. The former formulate the posterior probability to be maximized in terms of generative models for images, defined by likelihood functions and priors. The latter compute discriminative proposals using bottom-up tests to drive the maximization process in the solution space. Thus, the inference algorithm achieves both speed and consistency.

We also investigate three mechanisms for efficiently constructing the parsing graph, chosen according to the properties of the visual patterns being computed: a bottom-up construction mechanism, a top-down construction mechanism, and a combined bottom-up/top-down construction mechanism.
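The interplay between generative scoring and discriminative proposals described in the abstract can be illustrated with a toy sketch. This is not the thesis's algorithm: it swaps the image and parsing graph for a 1D signal and a set of breakpoints, and it uses simple greedy acceptance in place of whatever search the thesis employs. The function names, the Gaussian noise model, the per-breakpoint penalty, and the threshold are all assumptions made for illustration.

```python
def log_posterior(signal, breaks, sigma=1.0, penalty=5.0):
    """Unnormalized log p(W | I) = log p(I | W) + log p(W), where W is a set of breakpoints.

    Assumed generative model: each segment is a constant (its own mean)
    plus Gaussian noise; the prior charges a fixed penalty per breakpoint.
    """
    edges = [0] + sorted(breaks) + [len(signal)]
    log_lik = 0.0
    for lo, hi in zip(edges, edges[1:]):
        seg = signal[lo:hi]
        mean = sum(seg) / len(seg)
        log_lik -= sum((x - mean) ** 2 for x in seg) / (2 * sigma ** 2)
    log_prior = -penalty * len(breaks)
    return log_lik + log_prior


def bottom_up_proposals(signal, threshold=1.0):
    """Cheap discriminative test: propose a break wherever neighbours differ strongly.

    Proposals are returned strongest first, so the search tries the most
    promising hypotheses before weaker ones.
    """
    scored = [(abs(signal[i] - signal[i - 1]), i) for i in range(1, len(signal))]
    return [i for diff, i in sorted(scored, reverse=True) if diff > threshold]


def infer(signal):
    """Greedy search: accept a proposal only if it raises the posterior score."""
    breaks = set()
    best = log_posterior(signal, breaks)
    for b in bottom_up_proposals(signal):
        candidate = log_posterior(signal, breaks | {b})
        if candidate > best:  # top-down verification by the generative model
            breaks.add(b)
            best = candidate
    return sorted(breaks), best


signal = [0.1, -0.2, 0.0, 0.1, 5.1, 4.9, 5.0, 5.2, 2.0, 2.1, 1.9]
breaks, score = infer(signal)
print(breaks, score)
```

On this signal the bottom-up test proposes only two of the ten possible breakpoints, at indices 4 and 8, and the generative score accepts both. The proposals narrow the search (speed), while the posterior decides which hypotheses survive (consistency).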

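Bibliographic record

  • Author: Han, Feng
  • Author affiliation: University of California, Los Angeles
  • Degree-granting institution: University of California, Los Angeles
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2005
  • Pages: 132 p.
  • Total pages: 132
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords: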
