IEEE Conference on Computer Vision and Pattern Recognition Workshops

Temporal Domain Neural Encoder for Video Representation Learning



Abstract

We address the challenge of learning good video representations by explicitly modeling the relationships between visual concepts over time. We propose a novel Temporal Preserving Recurrent Neural Network (TPRNN) that takes frame-level features as input and extracts and encodes visual dynamics. The proposed architecture captures temporal dynamics by tracking the ordinal relationships of co-occurring visual concepts, and constructs video representations from their temporal order patterns. The resulting representations effectively encode the temporal information of dynamic patterns, making them more discriminative for human actions that differ only in the order of their action patterns. We evaluate the proposed model on several real-world video datasets, and the results show that it outperforms the baseline models. In particular, we observe significant improvements on action classes that can be distinguished only by capturing the temporal order of action patterns.
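The core property the abstract relies on, a video encoder whose output depends on the order of the frames, can be illustrated with a plain recurrent unit over frame-level features. This is a minimal generic sketch, not the paper's TPRNN; the weight shapes and random features below are made up for the demo:

```python
import numpy as np

def encode_frames(frames, W_h, W_x, b):
    """Fold a (T, d_in) sequence of frame-level features into one
    video vector by returning the final hidden state of a vanilla RNN."""
    h = np.zeros(W_h.shape[0])
    for x in frames:                       # one recurrent step per frame, in order
        h = np.tanh(W_h @ h + W_x @ x + b)
    return h

# Toy weights and frame features; dimensions are arbitrary for the demo.
rng = np.random.default_rng(0)
d_in, d_h, T = 8, 4, 5
W_h = rng.normal(scale=0.5, size=(d_h, d_h))
W_x = rng.normal(scale=0.5, size=(d_h, d_in))
b = np.zeros(d_h)
frames = rng.normal(size=(T, d_in))

v_fwd = encode_frames(frames, W_h, W_x, b)        # original frame order
v_rev = encode_frames(frames[::-1], W_h, W_x, b)  # reversed frame order

# A recurrent encoder separates the two orderings, whereas an
# order-invariant pooling such as a mean over frames cannot.
v_mean_fwd = frames.mean(axis=0)
v_mean_rev = frames[::-1].mean(axis=0)
```

Reversing the frames changes the recurrent encoding but leaves the mean-pooled vector untouched, which is why actions distinguished only by the order of their patterns need a temporally aware encoder.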
