We propose an approach for view-invariant object detection directly in 3D with the following properties: (ⅰ) detection is based on matching 3D contours to 3D object models; (ⅱ) the matching is constrained by qualitative spatial relations such as above/below, left/right, and front/back; (ⅲ) to ensure that every matching solution satisfies these constraints, we formulate the matching problem as finding maximum weight subgraphs with hard constraints and use a novel inference framework to solve it. Given a single view from an RGB-D camera, we obtain 3D contours by "back projecting" 2D contours extracted from the depth map. As our experimental results demonstrate, the proposed approach significantly outperforms state-of-the-art 2D approaches, in particular the latent SVM object detector, as well as recently proposed approaches for object detection in RGB-D data.
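The "back projecting" step can be illustrated with a minimal sketch. It assumes a standard pinhole camera model with intrinsics `fx`, `fy`, `cx`, `cy` and a depth map stored as rows of metric depth values; the abstract does not state the camera model or data layout, so the function name `back_project_contour` and all parameters are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def back_project_contour(
    contour_px: Sequence[Tuple[int, int]],
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Lift 2D contour pixels (u, v) to 3D camera coordinates.

    Assumption (not from the source): pinhole model, depth[v][u] holds the
    distance along the optical axis, and non-positive depth means "no reading".
    """
    points: List[Point3D] = []
    for u, v in contour_px:
        z = depth[v][u]
        if z <= 0:
            # Skip pixels where the sensor returned no valid depth.
            continue
        x = (u - cx) * z / fx  # horizontal offset from the optical axis
        y = (v - cy) * z / fy  # vertical offset from the optical axis
        points.append((x, y, z))
    return points


# Hypothetical 3x3 depth map; the bottom-right pixel has no valid reading.
depth_map = [
    [2.0, 2.0, 2.0],
    [2.0, 2.0, 2.0],
    [2.0, 2.0, 0.0],
]
contour = [(1, 1), (2, 1), (2, 2)]
contour_3d = back_project_contour(contour, depth_map, fx=1.0, fy=1.0, cx=1.0, cy=1.0)
```

In this sketch, pixels without valid depth are dropped, so a 3D contour may contain fewer points than its 2D source; how the proposed approach treats such gaps is not described in the abstract.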