International Conference on Computer Vision

Pix2Vox: Context-Aware 3D Reconstruction From Single and Multi-View Images

Abstract

Recovering the 3D representation of an object from single-view or multi-view RGB images with deep neural networks has attracted increasing attention in the past few years. Several mainstream works (e.g., 3D-R2N2) use recurrent neural networks (RNNs) to sequentially fuse multiple feature maps extracted from the input images. However, when given the same set of input images in different orders, RNN-based approaches are unable to produce consistent reconstruction results. Moreover, due to long-term memory loss, RNNs cannot fully exploit the input images to refine reconstruction results. To solve these problems, we propose Pix2Vox, a novel framework for single-view and multi-view 3D reconstruction. Using a well-designed encoder-decoder, it generates a coarse 3D volume from each input image. A context-aware fusion module is then introduced to adaptively select high-quality reconstructions for each part (e.g., table legs) from the different coarse 3D volumes, yielding a fused 3D volume. Finally, a refiner further refines the fused 3D volume to produce the final output. Experimental results on the ShapeNet and Pix3D benchmarks indicate that the proposed Pix2Vox outperforms state-of-the-art methods by a large margin. Furthermore, the proposed method is 24 times faster than 3D-R2N2 in terms of backward inference time. Experiments on unseen 3D categories of ShapeNet demonstrate the superior generalization ability of our method.
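The context-aware fusion step described in the abstract can be pictured as per-voxel weighting across views: each coarse volume is scored voxel by voxel, and a softmax over the view axis decides which view's reconstruction dominates each region. The PyTorch sketch below is a minimal illustration of that idea only; the scoring network (`score_net`) and its channel widths are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ContextAwareFusion(nn.Module):
    """Per-voxel weighted fusion of coarse volumes from multiple views."""

    def __init__(self, hidden_channels: int = 9):
        super().__init__()
        # Small 3D conv net that scores the quality of each coarse volume
        # at every voxel (hypothetical layer sizes, for illustration only).
        self.score_net = nn.Sequential(
            nn.Conv3d(1, hidden_channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv3d(hidden_channels, 1, kernel_size=3, padding=1),
        )

    def forward(self, coarse_volumes: torch.Tensor) -> torch.Tensor:
        # coarse_volumes: (batch, n_views, D, H, W), occupancies in [0, 1]
        b, n, d, h, w = coarse_volumes.shape
        scores = self.score_net(coarse_volumes.reshape(b * n, 1, d, h, w))
        scores = scores.reshape(b, n, d, h, w)
        # Softmax over the view axis: each voxel adaptively favors the
        # views that reconstructed that part (e.g., a table leg) best.
        weights = torch.softmax(scores, dim=1)
        return (weights * coarse_volumes).sum(dim=1)  # (batch, D, H, W)

# Example: fuse three 32^3 coarse volumes predicted from three views.
fusion = ContextAwareFusion()
fused = fusion(torch.rand(2, 3, 32, 32, 32))
print(fused.shape)  # torch.Size([2, 32, 32, 32])
```

Because the softmax is taken over the view dimension, the fused output does not depend on the order in which the input images arrive, which is exactly the consistency property the abstract says RNN-based fusion lacks.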