Hand pose estimation from a single depth image is an essential topic in computer vision and human-computer interaction. Despite recent advances in this area driven by convolutional neural networks, accurate hand pose estimation remains a challenging problem. In this paper we propose a Pose-guided structured Region Ensemble Network (Pose-REN) to boost the performance of hand pose estimation. The proposed method extracts regions from the feature maps of a convolutional neural network under the guidance of an initially estimated pose, generating more effective and representative features for hand pose estimation. The extracted feature regions are then integrated hierarchically according to the topology of the hand joints by means of tree-structured fully connected layers. A refined estimate of the hand pose is directly regressed by the proposed network, and the final hand pose is obtained through an iterative cascaded method. Comprehensive experiments on public hand pose datasets demonstrate that our proposed method outperforms state-of-the-art algorithms.
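The loop described above (crop feature regions around the currently estimated joints, regress a refined pose, repeat) can be illustrated with a minimal sketch. It is not the authors' implementation: the names `crop_region`, `pose_guided_regions`, and `iterative_refine`, the single-channel 2D feature map, the fixed window size, and the `regress` callback standing in for the tree-structured fully connected layers are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Joint = Tuple[float, float]        # (x, y) in feature-map coordinates (assumed 2D here)
Pose = List[Joint]                 # one entry per hand joint
FeatureMap = List[List[float]]     # single-channel H x W map, kept simple on purpose
Region = List[List[float]]


def crop_region(fmap: FeatureMap, center: Joint, size: int) -> Region:
    """Crop a size x size window around a joint, clamped to the map bounds.

    Assumes size <= min(H, W).
    """
    h, w = len(fmap), len(fmap[0])
    half = size // 2
    cx, cy = int(round(center[0])), int(round(center[1]))
    x0 = min(max(cx - half, 0), w - size)
    y0 = min(max(cy - half, 0), h - size)
    return [row[x0:x0 + size] for row in fmap[y0:y0 + size]]


def pose_guided_regions(fmap: FeatureMap, pose: Pose, size: int = 3) -> List[Region]:
    """Extract one feature region per joint, guided by the current pose estimate."""
    return [crop_region(fmap, joint, size) for joint in pose]


def iterative_refine(
    fmap: FeatureMap,
    init_pose: Pose,
    regress: Callable[[List[Region], Pose], Pose],
    stages: int = 3,
    size: int = 3,
) -> Pose:
    """Cascade: each stage re-crops regions around the latest pose and regresses again.

    `regress` is a hypothetical placeholder for the learned regressor.
    """
    pose = init_pose
    for _ in range(stages):
        regions = pose_guided_regions(fmap, pose, size)
        pose = regress(regions, pose)
    return pose
```

In this sketch the feature map is held fixed across stages and only the cropping locations change; whether features are recomputed at each stage is a design choice the abstract does not specify.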