Video action detection method based on convolutional neural network

Abstract

A video action detection method based on a convolutional neural network (CNN) is disclosed, in the field of computer vision and recognition. A temporal-spatial pyramid pooling layer is added to the network structure. This removes the network's restrictions on its input, speeds up training and detection, and improves the performance of video action classification and temporal localization. The disclosed convolutional neural network comprises convolutional layers, ordinary pooling layers, a temporal-spatial pyramid pooling layer, and fully connected layers. The network has two output layers: one for category classification and one for the computed temporal localization result. The disclosed method does not need down-sampling to obtain video clips of different durations; instead, the whole video is input directly at once, which improves efficiency. Moreover, the network is trained on video clips of the same frequency, so differences within a category are not increased. This reduces the learning burden on the network, giving faster model convergence and better detection results.
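The abstract does not say how the temporal-spatial pyramid pooling layer is built, so the following is only a minimal sketch of the general idea: pooling a feature map of any duration into a fixed-length vector that a fully connected layer could consume. The pyramid levels, the use of max pooling, the single-channel nested-list layout, and the names `bin_edges`, `st_pyramid_pool` and `make_clip` are all assumptions made for illustration, not details taken from the patent.

```python
from itertools import product


def bin_edges(length, n_bins):
    """Split range(length) into n_bins non-empty spans (they may overlap)."""
    return [((i * length) // n_bins, ((i + 1) * length + n_bins - 1) // n_bins)
            for i in range(n_bins)]


def st_pyramid_pool(feat, levels=((1, 1, 1), (2, 2, 2), (4, 2, 2))):
    """Max-pool a T x H x W feature map (nested lists) into a fixed-length list.

    Each level (nt, nh, nw) divides time, height and width into that many
    bins. The level sizes here are assumed values chosen for illustration.
    """
    T, H, W = len(feat), len(feat[0]), len(feat[0][0])
    out = []
    for nt, nh, nw in levels:
        for (t0, t1), (h0, h1), (w0, w1) in product(
                bin_edges(T, nt), bin_edges(H, nh), bin_edges(W, nw)):
            # One pooled value per bin, regardless of how large the bin is.
            out.append(max(feat[t][h][w]
                           for t in range(t0, t1)
                           for h in range(h0, h1)
                           for w in range(w0, w1)))
    return out


def make_clip(T, H=4, W=4):
    """Toy feature map whose values simply increase with position."""
    return [[[(t * H + h) * W + w for w in range(W)]
             for h in range(H)] for t in range(T)]


short = st_pyramid_pool(make_clip(5))    # 5 time steps
long = st_pyramid_pool(make_clip(37))    # 37 time steps
print(len(short), len(long))             # both lengths are 1 + 8 + 16
```

Because the number of bins is fixed by the pyramid levels rather than by the input, clips of different durations yield vectors of the same length. That is the property that would let a whole video be fed in directly, without resampling clips to a common length.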
