Most object recognition systems ignore interactions between objects in a scene and take the best interpretation to be the set of hypothesized objects that matches the greatest number of image features. Visual and physical interactions, however, provide a rich source of information: occlusion explains why features might be undetected, and physical constraints ensure a realisable interpretation. We show how these interactions can be modeled easily using a Bayesian network, and how the problem of interpretation can be cast as finding the most likely explanation for such a network.
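
As an illustration only, the idea of casting interpretation as a most-likely-explanation query can be sketched on a toy network. The two objects, the three features, and every probability below are hypothetical values chosen for the example, not taken from this work; the search is plain brute-force enumeration, which is only practical for very small networks.

```python
from itertools import product

# Hypothetical toy network (all structure and numbers are assumptions):
#   A, B  : whether object A / object B is present in the scene
#   f_a   : a feature of A that B hides when B is present (occlusion)
#   g_a   : a feature of A that B never hides
#   f_b   : a feature of B
PRIOR_A = 0.5
PRIOR_B = 0.5


def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true


def p_f_a(a, b):
    """P(f_a detected | A, B): low when A is absent or occluded by B."""
    if not a:
        return 0.05          # spurious detection
    return 0.1 if b else 0.9  # occluded vs. visible


def p_g_a(a):
    """P(g_a detected | A)."""
    return 0.9 if a else 0.05


def p_f_b(b):
    """P(f_b detected | B)."""
    return 0.9 if b else 0.05


def joint(a, b, evidence):
    """Joint probability of hypothesis (A=a, B=b) and the observed features."""
    return (
        bernoulli(PRIOR_A, a)
        * bernoulli(PRIOR_B, b)
        * bernoulli(p_f_a(a, b), evidence["f_a"])
        * bernoulli(p_g_a(a), evidence["g_a"])
        * bernoulli(p_f_b(b), evidence["f_b"])
    )


def most_likely_explanation(evidence):
    """Enumerate all (A, B) assignments and return the most probable one."""
    hypotheses = product([False, True], repeat=2)
    return max(hypotheses, key=lambda h: joint(h[0], h[1], evidence))


# One feature of A is missing, but the other feature of A and B's feature are seen.
evidence = {"f_a": False, "g_a": True, "f_b": True}
best = most_likely_explanation(evidence)
print(best)  # (True, True)
```

In this sketch a pure feature-counting score would penalize object A for its missing feature, whereas the network attributes the miss to occlusion by B and still prefers the interpretation in which both objects are present.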