Journal: Neural Processing Letters

Spatiotemporal Fusion Networks for Video Action Recognition

Abstract

Learning spatiotemporal information is a fundamental part of action recognition. In this work, we attempt to extract efficient spatiotemporal information for video representation through a novel architecture, termed SpatioTemporal Fusion Networks (STFN). STFN extracts spatiotemporal information by introducing connections between the spatial and temporal streams of a two-stream network through fusion blocks, called Compactly Fuse Spatial and Temporal information (CFST) blocks, whose goal is to integrate spatial and temporal information at little computational cost. CFST is built upon Compact Bilinear Pooling, which can capture multiplicative interactions at corresponding locations. To better integrate the two streams, we explore fusion configurations: where to insert the fusion block, and how to combine the CFST block with additive interaction. We evaluate the proposed architecture on UCF-101 and HMDB-51 and obtain comparable performance.
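The abstract does not spell out the internals of the CFST block, so the following is only a minimal sketch of the general Compact Bilinear Pooling idea it builds on, in its usual Tensor Sketch form: each stream's features are projected with a random Count Sketch, and the two sketches are combined by circular convolution (computed with an FFT), which approximates the outer product of the two feature vectors at each spatial location. The function names `sketch_matrix` and `compact_bilinear_fuse`, the feature-map layout `(H, W, C)`, and the output dimension are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def sketch_matrix(channels, out_dim, rng):
    """Build a Count Sketch projection as a (channels, out_dim) matrix.

    Each input channel i is sent to one random output bin h[i]
    with a random sign s[i] in {-1, +1}. (Hypothetical helper.)
    """
    h = rng.integers(0, out_dim, size=channels)
    s = rng.choice([-1.0, 1.0], size=channels)
    m = np.zeros((channels, out_dim))
    m[np.arange(channels), h] = s
    return m


def compact_bilinear_fuse(spatial, temporal, m_s, m_t):
    """Fuse two feature maps of shape (H, W, C) into one of shape (H, W, d).

    The product of the two FFTs is the circular convolution of the two
    sketches, which equals a Count Sketch of the outer product of the
    spatial and temporal feature vectors at the same location.
    """
    d = m_s.shape[1]
    sk_s = spatial @ m_s    # (H, W, d) sketch of the spatial stream
    sk_t = temporal @ m_t   # (H, W, d) sketch of the temporal stream
    prod = np.fft.rfft(sk_s, axis=-1) * np.fft.rfft(sk_t, axis=-1)
    return np.fft.irfft(prod, n=d, axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, d = 8, 32                      # illustrative sizes only
    m_s = sketch_matrix(C, d, rng)
    m_t = sketch_matrix(C, d, rng)
    spatial = rng.standard_normal((7, 7, C))
    temporal = rng.standard_normal((7, 7, C))
    fused = compact_bilinear_fuse(spatial, temporal, m_s, m_t)
    print(fused.shape)
```

The fused map has `d` channels rather than the `C * C` channels a full bilinear (outer-product) fusion would need, which is where the low computational cost comes from. An additive interaction, as mentioned in the abstract, would be an element-wise sum of the two streams; how the paper combines it with the CFST output is not stated here.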