The objective of this work is to develop a vision system for 3-D scene reconstruction from sequences of noisy binocular images. The vision system assumes that the camera positions are known up to some uncertainty, and uses the camera parameters in the process of 3-D scene reconstruction. In contrast to other vision systems that utilize multiple images, this system does not commit to a single correspondence between pixels in an image pair. Instead, it considers all possible correspondences (as constrained by the camera parameters) and assigns a confidence value to each one. Confidence values are assigned only at significant intensity changes. The confidences are a function of the number of possible correspondences and of the similarity in intensity changes of the candidate pixels. For computational efficiency, the confidence values are assigned hierarchically, starting with the images at the lowest resolution. Each significant intensity change at a lower resolution is associated with a group of pixels at the next higher resolution. This association is a function of the multi-resolution image generation process, in this case a Gaussian pyramid, and it determines the positions of the pixels to be considered when establishing confidences at the higher resolution. At each level of the hierarchy, the vision system incorporates the information provided by image sequences by fusing the confidence values assigned to possible correspondences across the sequence of image pairs. The results of fusion guide matching at the next level of the resolution hierarchy. The fusion process employs an optimal distributed fusion algorithm. The dynamic models for information fusion are derived using 3-D geometrical modeling, assuming a probabilistic description of 3-D uncertainty volumes. The fusion process is dynamic, so the fused confidences are updated continuously as new information becomes available.
In the fusion process, the confidence values of correct correspondences increase and the confidence values of false correspondences decrease. At the end of the fusion process, the correspondences with high confidence values provide correct 3-D information about the scene. The objects in the scene are reconstructed by moving-least-squares surface interpolation over the sparse set of calculated 3-D points. In experiments on noisy binocular image sequences with a noise standard deviation of 20, over 95% of the matches are correct and the average error of the calculated 3-D distances to the objects is less than 5%. The vision system is robust to image noise because it fuses information from both the image pairs and the image sequences. It is also computationally efficient, because it employs a resolution hierarchy and uses distributed fusion algorithms that can be implemented in parallel.
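The idea of keeping every candidate correspondence and letting the sequence decide among them can be illustrated with a minimal sketch. It is not the optimal distributed fusion algorithm of this work: the Gaussian similarity kernel, the multiplicative update, the parameter `sigma`, and the function names `candidate_confidences` and `fuse` are all assumptions chosen for illustration, and the sketch further assumes that candidate i refers to the same 3-D hypothesis in every frame.

```python
import math


def candidate_confidences(left_change, right_changes, sigma=10.0):
    """Confidence for each candidate match of one left-image intensity change.

    Assumed form: a Gaussian kernel on the difference in intensity change,
    normalized over all candidates, so that more candidates means less
    confidence for each of them.
    """
    if not right_changes:
        return []
    weights = [
        math.exp(-((left_change - r) ** 2) / (2.0 * sigma ** 2))
        for r in right_changes
    ]
    total = sum(weights)
    if total == 0.0:
        # Every candidate is extremely dissimilar: fall back to uniform.
        return [1.0 / len(right_changes)] * len(right_changes)
    return [w / total for w in weights]


def fuse(previous, current):
    """Illustrative stand-in for fusion: multiply, then renormalize."""
    products = [p * c for p, c in zip(previous, current)]
    total = sum(products)
    if total == 0.0:
        # No agreement at all: keep only the newest evidence.
        return list(current)
    return [x / total for x in products]


# Hypothetical data: (left intensity change, candidate right intensity changes)
# for three consecutive image pairs. Candidate 0 is consistently similar.
frames = [
    (40.0, [38.0, 12.0, 55.0]),
    (42.0, [41.0, 60.0, 20.0]),
    (39.0, [40.0, 35.0, 70.0]),
]

first = candidate_confidences(*frames[0])
fused = first
for left, rights in frames[1:]:
    fused = fuse(fused, candidate_confidences(left, rights))

best = max(range(len(fused)), key=fused.__getitem__)
```

In this toy run the confidence of the consistent candidate grows from frame to frame while the others shrink, which mirrors the behavior described above for correct and false correspondences.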