This thesis investigates the problem of search and convergence complexity for algorithms which detect deformable visual objects, such as hands, in images. This problem is of great practical importance for the design of artificial vision systems and also throws light on the biology of human perception. The thesis formulates vision as a decoding problem where the goal is to determine information about the world from intensity patterns reaching the eye or camera.

Deformable template models are used to represent the objects by encoding probabilities for their shape and appearance. These models are partially specified by the user and partially learned from representative image data. Bayesian probability theory is used to synthesize shapes from these models and verify their plausibility.

The thesis gives a framework for detecting deformable shapes in terms of the A* search procedure. It proves that many current vision search algorithms, such as twenty questions, dynamic programming, and Dijkstra, are special cases of A*. Using this framework, and techniques adapted from information theory, it is possible to prove expected convergence times of search algorithms and to define a measure of search complexity for deformable shapes (analogous to an order parameter in statistical physics). These theoretical results demonstrate the existence of search algorithms with acceptable time complexity which can detect these deformable shapes.

The theory is illustrated by computer experiments using dynamic programming and A* for detecting shapes such as hands and cat ears. The experiments show that the algorithms can deal with significant shape deformations, large occlusions of the target objects, and the presence of multiple targets.
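As an illustration of the claim that Dijkstra is a special case of A*, the following minimal Python sketch runs a generic A* search on a small weighted graph. It is not the thesis's formulation for deformable templates: the function `a_star`, the toy graph `GRAPH`, and the heuristic values in `H` are all hypothetical and chosen only for illustration. With the default heuristic (zero everywhere) the priority of a node is just its cost so far, which is the ordering Dijkstra's algorithm uses.

```python
import heapq


def a_star(graph, start, goal, heuristic=lambda node: 0.0):
    """Return (cost, path) for a cheapest path from start to goal.

    graph: dict mapping node -> list of (neighbor, edge_cost), costs >= 0.
    heuristic: admissible estimate of the remaining cost to the goal.
               The default (zero everywhere) gives Dijkstra's ordering.
    """
    # Heap entries: (cost so far + heuristic, cost so far, node, path).
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                priority = new_cost + heuristic(neighbor)
                heapq.heappush(
                    frontier, (priority, new_cost, neighbor, path + [neighbor])
                )
    return float("inf"), []  # goal unreachable


# Hypothetical toy graph, for illustration only.
GRAPH = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}

# Hypothetical heuristic: never overestimates the true remaining cost to "D".
H = {"A": 3, "B": 2, "C": 1, "D": 0}

dijkstra_result = a_star(GRAPH, "A", "D")          # zero heuristic
a_star_result = a_star(GRAPH, "A", "D", H.get)     # informed heuristic
```

Both calls find the same cheapest path; the heuristic changes only the order in which nodes are expanded, not the answer, provided it never overestimates the remaining cost.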