Three-dimensional human pose estimation (3D HPE) has broad applications in trajectory prediction, posture tracking, and action analysis. However, frequent self-occlusions and the substantial depth ambiguity of two-dimensional (2D) representations limit further gains in accuracy. In this paper, we propose a novel video-based, human-body geometry-aware network to mitigate these problems. The network implicitly learns the geometric constraints of the human body by capturing spatial and temporal context information from 2D skeleton data. Specifically, a novel skeleton attention (SA) mechanism models geometric context dependencies among different body joints, thereby improving the spatial feature representation ability of the network. To enhance temporal consistency, a novel multilayer perceptron (MLP)-Mixer-based structure is used to comprehensively learn temporal context information from the input sequences. We conduct experiments on challenging, publicly available datasets to evaluate the proposed approach. It outperforms the previous best approach by 0.5 mm on the Human3.6M dataset and also demonstrates significant improvements on the HumanEva-I dataset.
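To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' architecture: plain scaled dot-product self-attention across the joints of each frame stands in for the skeleton attention mechanism, and a single token-mixing MLP along the time axis stands in for the MLP-Mixer-based temporal structure. All names, layer sizes, the ReLU activation, the random weights, and the 17-joint, 9-frame input are assumptions chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def joint_attention(x, wq, wk, wv):
    """Self-attention across the J joints of every frame.

    x: (T, J, C) 2D keypoints; wq, wk, wv: (C, D) projections (assumed).
    Returns features of shape (T, J, D) and attention weights (T, J, J).
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def temporal_mixing(x, w1, w2):
    """Token-mixing MLP along the time axis with a residual connection.

    x: (T, J, D); w1: (T, H); w2: (H, T). ReLU is used here for brevity.
    """
    y = np.moveaxis(x, 0, -1)        # (J, D, T): time becomes the last axis
    h = np.maximum(y @ w1, 0.0)      # (J, D, H)
    y = y + h @ w2                   # (J, D, T)
    return np.moveaxis(y, -1, 0)     # back to (T, J, D)

# Hypothetical sizes: 9 frames, 17 joints, 2D input, 16 features, 32 hidden.
rng = np.random.default_rng(0)
T, J, C, D, H = 9, 17, 2, 16, 32
poses_2d = rng.standard_normal((T, J, C))
wq, wk, wv = (rng.standard_normal((C, D)) * 0.1 for _ in range(3))
w1 = rng.standard_normal((T, H)) * 0.1
w2 = rng.standard_normal((H, T)) * 0.1
w_out = rng.standard_normal((D, 3)) * 0.1   # linear head to 3D coordinates

spatial, attn = joint_attention(poses_2d, wq, wk, wv)
mixed = temporal_mixing(spatial, w1, w2)
poses_3d = mixed @ w_out                     # (T, J, 3)
```

In this sketch the attention weights let every joint draw on all other joints in the same frame, while the temporal MLP lets every frame draw on all other frames for the same joint. A trained model would learn these weights from data rather than sample them at random.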