In this paper, we propose a new method for retrieving multipart, articulated objects from images and video. The scheme is particularly well suited to objects that may appear in different poses, with possible shape deformation and articulated motion. It computes an invariant signature for each segmented region in the image, in a manner that is insensitive to translation, rotation, scale, and shear. Using circular cross-correlation, these signatures can then be compared efficiently with those of user-defined regions of interest. Ambiguities between individual region matches are resolved through relaxation labeling. A final match is established when a collection of segmented regions conforms to the query object, in terms of both local shape description and global structural relation. The scheme thus allows for articulated movement of object parts within the scene. The procedure is easy to implement, yet shows promising results in its ability to isolate regions of interest in images and video, to account for structural and relational constraints among regions, and to integrate local shape and global structural information for a detailed examination of the scene in a way that is invariant to many visual variations.
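As an illustration of the signature-comparison step, the following is a minimal sketch of normalized circular cross-correlation between two region signatures. It is not the implementation used in this work: it assumes that each signature is an equal-length 1-D sequence of samples and that an unknown starting point shows up as a circular shift. The function names `normalize`, `circular_cross_correlation`, and `best_match` are hypothetical.

```python
import math


def normalize(sig):
    """Return a zero-mean, unit-norm copy of a 1-D signature (assumed format)."""
    n = len(sig)
    mean = sum(sig) / n
    centered = [v - mean for v in sig]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0.0:
        raise ValueError("signature is constant; correlation is undefined")
    return [v / norm for v in centered]


def circular_cross_correlation(a, b):
    """Normalized circular cross-correlation of two equal-length signatures.

    Entry k is sum_i a[(i + k) mod n] * b[i], computed on the normalized
    signatures, so every entry lies in [-1, 1].
    """
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    a = normalize(a)
    b = normalize(b)
    n = len(a)
    # Direct O(n^2) evaluation, kept simple for clarity.
    return [sum(a[(i + k) % n] * b[i] for i in range(n)) for k in range(n)]


def best_match(a, b):
    """Return (shift, score) for the circular shift that best aligns a with b."""
    scores = circular_cross_correlation(a, b)
    shift = max(range(len(scores)), key=scores.__getitem__)
    return shift, scores[shift]


if __name__ == "__main__":
    query = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0, 1.0, 0.0]
    # Candidate: the query rotated right by 3 samples, then rescaled and offset.
    candidate = [2.0 * v + 5.0 for v in query[-3:] + query[:-3]]
    print(best_match(candidate, query))
```

The direct evaluation above costs O(n^2) per pair of signatures. The same correlation can be obtained in O(n log n) by multiplying the discrete Fourier transform of one signature with the complex conjugate of the other and inverting the result, which is the usual route when many regions must be compared.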