With the rapid development of modern computers and networks, a great deal of research has focused on the transition from 2-D visual applications to the 3-D visual world. The need for image coding is naturally magnified in 3-D applications that employ a large number of highly correlated 2-D images. Recent studies have shown that in multi-view image coding, if the 3-D scene geometry is known, the coding efficiency, decoding speed, and rendered visual quality can be dramatically improved. Motivated by this observation and by open research problems in existing 3-D geometry-based multi-view coding schemes, this dissertation proposes and develops a multi-view coding system that operates directly in 3-D space based on automatic 3-D scene reconstruction.

Furthermore, this dissertation makes two contributions in the field of automatic 3-D scene reconstruction. First, a new multistage self-calibration algorithm is proposed. We derive a polynomial optimization function of the intrinsic parameters that makes the optimization simple and insensitive to initialization. Then, based on a stability analysis of the intrinsic parameters, a multistage procedure to refine the self-calibration is proposed. Second, we present a new proof that there are only four possible solutions when recovering the relative camera motion from the essential matrix. The new proof concentrates on the geometric relationship among the essential matrix, the camera rotation, and the camera translation. In addition, we provide a generalized SVD-based proof for the four possible solutions in decomposing the essential matrix.

We propose a multi-view image coding system in 3-D space based on automatic 3-D scene reconstruction. We establish a unifying 3-D scene voxel model for all the available image views and then encode the 3-D scene voxel model and, optionally, the residual data for compression.
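The four-solution result for the essential matrix mentioned above can be illustrated with the standard textbook SVD construction. The sketch below is not the dissertation's generalized proof or its algorithm. The function names `skew` and `decompose_essential` are hypothetical, and the convention E = [t]x R with a unit-length translation is an assumption made for this illustration.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix [t]x of a 3-vector t."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Return the four candidate (R, t) pairs for an essential matrix.

    Assumes E = [t]x R up to scale (an assumption of this sketch).
    The translation is recovered only up to sign and scale, so a
    unit-length direction is returned.
    """
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so both factors are proper rotations (det = +1).
    # This only changes E by a sign, which an essential matrix allows.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt        # first candidate rotation
    R2 = U @ W.T @ Vt      # second candidate rotation
    t = U[:, 2]            # left null vector of E, defined up to sign
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

In practice, a single candidate is usually selected by triangulating a point correspondence and keeping the pair that places the point in front of both cameras.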
The 3-D voxel model has several advantages over the combination of mesh model and texture data used in many existing multi-view image coding systems. First, the 3-D voxel model is much simpler in structure than the mesh model. Second, reconstructing the original images or generating synthetic images from the 3-D voxel model is straightforward: it can be achieved by re-projecting the 3-D model back onto the image planes, whereas image reconstruction from the mesh model requires mapping the texture data onto the mesh. Third, since the 3-D voxel model is an extension from 2-D data to 3-D data, many existing image/video coding techniques can be applied to coding the 3-D voxel model. (Abstract shortened by UMI.)
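The re-projection step described above can be sketched as follows. This is a minimal illustration, not the dissertation's renderer. It assumes a pinhole camera with intrinsic matrix `K`, a world-to-camera pose `(R, t)`, one colour per voxel centre, and a simple nearest-depth test for occlusion. The name `reproject_voxels` and all parameters are hypothetical.

```python
import numpy as np

def reproject_voxels(centers, colors, K, R, t, height, width):
    """Render coloured voxel centres into one view with a z-buffer.

    centers: (N, 3) voxel centres in world coordinates.
    colors:  (N, 3) colour per voxel.
    K, R, t: assumed pinhole intrinsics and world-to-camera pose.
    """
    cam = centers @ R.T + t                      # world -> camera frame
    image = np.zeros((height, width, 3), dtype=colors.dtype)
    depth = np.full((height, width), np.inf)     # nearest depth so far
    for p, c in zip(cam, colors):
        if p[2] <= 0:                            # behind the camera
            continue
        u, v, w = K @ p                          # homogeneous pixel coords
        col, row = int(round(u / w)), int(round(v / w))
        inside = 0 <= row < height and 0 <= col < width
        if inside and p[2] < depth[row, col]:    # keep the closest voxel
            depth[row, col] = p[2]
            image[row, col] = c
    return image
```

Pixels that no voxel projects onto stay zero in this sketch. A coding system of the kind described would presumably cover such gaps with the optional residual data.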