Complete scene reconstruction from single view RGBD is a challenging task, requiring estimation of scene regions occluded from the captured depth surface. We propose that scene-centric analysis of human motion within an indoor scene can reveal fully occluded objects and provide functional cues to enhance scene understanding tasks. Captured skeletal joint positions of humans, utilised as naturally exploring active sensors, are projected into a human-scene motion representation. Inherent body occupancy is leveraged to carve a volumetric scene occupancy map initialised from captured depth, revealing a more complete voxel representation of the scene. To obtain a structured box model representation of the scene, we introduce unique terms to an object detection optimisation that overcome depth occlusions whilst deriving from the same depth data. The method is evaluated on challenging indoor scenes with multiple occluding objects such as tables and chairs. Evaluation shows that human-centric scene analysis can be applied to effectively enhance state-of-the-art scene understanding approaches, resulting in a more complete representation than single view depth alone.
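The carving step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a boolean voxel grid in which True means "possibly occupied", and it approximates body occupancy by a sphere around each skeletal joint. The function name `carve_occupancy` and the parameters `voxel_size`, `origin` and `body_radius` are hypothetical, and the all-True grid below merely stands in for a map initialised from captured depth.

```python
import numpy as np


def carve_occupancy(occupancy, joints, voxel_size, origin, body_radius):
    """Return a copy of `occupancy` with voxels near any joint marked free.

    occupancy   : bool array (X, Y, Z), True = possibly occupied
    joints      : float array (J, 3), joint positions in scene coordinates
    voxel_size  : edge length of one voxel (same units as joints)
    origin      : scene coordinates of the grid corner at index (0, 0, 0)
    body_radius : assumed radius of body volume around each joint
    """
    # Scene coordinates of every voxel centre, in C (row-major) order.
    idx = np.indices(occupancy.shape).reshape(3, -1).T
    centres = np.asarray(origin, dtype=float) + (idx + 0.5) * voxel_size

    # A voxel is free if the body sphere of any joint covers its centre.
    free = np.zeros(len(centres), dtype=bool)
    for joint in np.atleast_2d(joints):
        free |= np.linalg.norm(centres - joint, axis=1) <= body_radius

    carved = occupancy.copy()
    carved[free.reshape(occupancy.shape)] = False
    return carved


# Stand-in for a depth-initialised map: a 1 m cube, 10 cm voxels, all unknown.
grid = np.ones((10, 10, 10), dtype=bool)
joints = np.array([[0.55, 0.55, 0.55]])  # one observed joint position
carved = carve_occupancy(grid, joints, voxel_size=0.1,
                         origin=(0.0, 0.0, 0.0), body_radius=0.15)
```

Accumulating joints over a whole motion sequence would progressively free the space a person has moved through, so regions that remain occupied behind the visible depth surface become candidates for occluded objects.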