Venue: IEEE International Conference on Computer Vision Workshops

Vision-as-Inverse-Graphics: Obtaining a Rich 3D Explanation of a Scene from a Single Image


Abstract

We develop an inverse graphics approach to the problem of scene understanding, obtaining a rich representation that includes descriptions of the objects in the scene and their spatial layout, as well as global latent variables such as the camera parameters and lighting. The framework's stages include object detection, prediction of the camera and lighting variables, and prediction of object-specific variables (shape, appearance and pose). Together these stages act like the encoder of an autoencoder, with graphics rendering as the decoder. Importantly, the scene representation is interpretable and of variable dimension, matching the number of detected objects plus the global variables. For the prediction of the camera latent variables we introduce a novel architecture termed Probabilistic HoughNets (PHNs), which provides a principled approach to combining information from multiple detections. We demonstrate the quality of the reconstructions quantitatively on synthetic data and qualitatively on real scenes.
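The variable-dimension scene representation described in the abstract can be pictured with a minimal Python sketch. Everything here is an illustrative assumption rather than the authors' implementation: the class names `ObjectLatents` and `SceneLatents`, the helper `make_object`, and the vector sizes (3 camera, 2 lighting, 4 shape, 3 appearance, 3 pose values) are hypothetical. The sketch only shows how the total latent dimension grows with the number of detected objects while the global variables stay fixed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectLatents:
    """Per-object variables named in the abstract (sizes are hypothetical)."""
    shape: List[float]
    appearance: List[float]
    pose: List[float]

    def dim(self) -> int:
        return len(self.shape) + len(self.appearance) + len(self.pose)


@dataclass
class SceneLatents:
    """Global variables plus one ObjectLatents entry per detected object."""
    camera: List[float]
    lighting: List[float]
    objects: List[ObjectLatents] = field(default_factory=list)

    def dim(self) -> int:
        # Global part is fixed; the object part scales with the detection count.
        global_dim = len(self.camera) + len(self.lighting)
        return global_dim + sum(obj.dim() for obj in self.objects)


def make_object() -> ObjectLatents:
    # Hypothetical sizes: 4 shape, 3 appearance, 3 pose values.
    return ObjectLatents(shape=[0.0] * 4, appearance=[0.0] * 3, pose=[0.0] * 3)


def make_scene(num_objects: int) -> SceneLatents:
    # Hypothetical sizes: 3 camera and 2 lighting values.
    return SceneLatents(
        camera=[0.0] * 3,
        lighting=[0.0] * 2,
        objects=[make_object() for _ in range(num_objects)],
    )


scene_two = make_scene(2)
scene_five = make_scene(5)
```

Under these assumed sizes, `scene_two.dim()` and `scene_five.dim()` differ only by the per-object contribution, which is the sense in which the representation matches the detected number of objects plus the global variables.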
