To detect spatio-temporal interest points that reflect the characteristics of human action and are more robust to noise and camera zooming, a novel spatio-temporal interest point detector is first proposed. Next, centered on each detected interest point, a polyhedron-model-based spatio-temporal gradient descriptor is built to further characterize the visual features of human action in space and time. Then, a larger and more effective codebook of video action features is constructed using the bag-of-words method on a hierarchical clustering (vocabulary) tree. Finally, the feature descriptors are combined with high-level, human-defined action attributes, and a latent support vector machine with coordinate descent is used to find a local optimum of the final recognition model. Experiments on several typical databases show that the proposed method achieves a high human action recognition rate.
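The codebook step can be illustrated with a minimal sketch of a hierarchical vocabulary tree used for bag-of-words quantization. This is not the paper's implementation: the abstract does not give the clustering algorithm or its parameters, so plain k-means, the branching factor `k`, the tree `depth`, and every function name below are assumptions chosen for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of descriptors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def nearest(centers, p):
    """Index of the center closest to descriptor p."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], p))

def kmeans(points, k, rng, iters=10):
    """Plain k-means (an assumed choice); returns centers and their groups."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        # An empty cluster keeps its previous center.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    groups = [[] for _ in range(k)]
    for p in points:
        groups[nearest(centers, p)].append(p)
    return centers, groups

def build_tree(points, k, depth, rng):
    """Recursively cluster descriptors; None marks a leaf (a visual word)."""
    if depth == 0 or len(points) < k:
        return None
    centers, groups = kmeans(points, k, rng)
    children = [build_tree(g, k, depth - 1, rng) for g in groups]
    return {"centers": centers, "children": children}

def quantize(tree, p):
    """Descend the tree; the path of branch indices identifies the word."""
    path, node = [], tree
    while node is not None:
        i = nearest(node["centers"], p)
        path.append(i)
        node = node["children"][i]
    return tuple(path)

def bow_histogram(tree, descriptors):
    """Normalized histogram of visual words for one video clip."""
    hist = {}
    for d in descriptors:
        w = quantize(tree, d)
        hist[w] = hist.get(w, 0) + 1
    total = sum(hist.values())
    return {w: c / total for w, c in hist.items()}
```

With branching factor `k` and depth `d`, such a tree can hold up to `k**d` words while quantizing a descriptor needs only about `k * d` distance computations, which is the usual reason a tree allows a larger codebook than flat clustering.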