Conference: IEEE International Conference on Multimedia and Expo

Multimodal-Semantic Context-Aware Graph Neural Network for Group Activity Recognition



Abstract

Group activities in videos involve visual interaction contexts across multiple modalities between actors, as well as co-occurrences among individual action labels. However, most current group activity recognition methods either model actor-actor relations using only the RGB modality or fail to exploit label relationships. To capture these rich visual and semantic contexts, we propose a multimodal-semantic context-aware graph neural network (MSCA-GNN). Specifically, we first build two visual sub-graphs based on the appearance cues and motion patterns extracted from the RGB and optical-flow modalities, respectively. Then, two attention-based aggregators are proposed to refine each node by gathering representations from other nodes and from the heterogeneous modality. In addition, a semantic graph is constructed from linguistic embeddings to model label relationships. We employ a bi-directional mapping learning strategy to further integrate information from the multimodal visual graphs and the semantic graph. Experimental results on two group activity benchmarks demonstrate the effectiveness of the proposed method.
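The cross-modal refinement step described in the abstract (each node gathering representations from nodes of the other modality via attention) can be illustrated with a minimal sketch. This is not the paper's implementation; the function name, the scaled dot-product form of attention, and the residual update are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_aggregate(nodes_q, nodes_kv):
    """Refine query nodes by attending over another modality's nodes.

    nodes_q:  (N, d) actor features from one modality (e.g. RGB)
    nodes_kv: (M, d) actor features from the other modality (e.g. optical flow)
    Returns:  (N, d) refined features (residual add of the attended context).
    """
    d = nodes_q.shape[1]
    # Scaled dot-product affinities between actors across modalities
    att = softmax(nodes_q @ nodes_kv.T / np.sqrt(d), axis=1)  # (N, M)
    context = att @ nodes_kv                                  # (N, d)
    return nodes_q + context                                  # residual refinement

# Toy example: 3 RGB actor nodes and 3 flow actor nodes with 4-dim features
rgb = np.random.default_rng(0).normal(size=(3, 4))
flow = np.random.default_rng(1).normal(size=(3, 4))
refined = attention_aggregate(rgb, flow)
print(refined.shape)  # (3, 4)
```

In the paper's full model, analogous aggregators also operate within each visual sub-graph, and the refined visual features are further coupled with the label-embedding semantic graph through the bi-directional mapping strategy.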

