Journal: IEEE Transactions on Image Processing

Learning Semantics-Preserving Attention and Contextual Interaction for Group Activity Recognition


Abstract

In this paper, we investigate group activity recognition by learning semantics-preserving attention and contextual interaction among different people. Conventional methods usually aggregate the features extracted from individual persons with pooling operations, which lack physical meaning and cannot fully exploit the contextual information for group activity recognition. To address this, we develop a Semantics-Preserving Teacher-Student (SPTS) network architecture. Our SPTS networks first learn a Teacher Network in the semantic domain that classifies the word of the group activity based on the words of the individual actions. Then, we design a Student Network in the appearance domain that recognizes the group activity from the input video. We force the Student Network to mimic the Teacher Network during learning. In this way, we allocate semantics-preserving attention to different people, which is more effective at finding the key people and discarding the misleading ones, while requiring no extra labeled data. Moreover, a group of people inherently forms a graph-based structure, where the people and their relationships can be regarded as the nodes and edges of a graph, respectively. Based on this, we build two graph convolutional modules on both the Teacher Network and the Student Network to reason about the dependencies among different people. Furthermore, we extend our approach to the action segmentation task based on its intermediate features. Experimental results on four datasets for group activity analysis show that our method outperforms the state of the art.
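The ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of softmax attention, the mean-squared-error mimic loss, and the weight-free neighbour averaging standing in for a graph convolution are all assumptions chosen for illustration, and the toy features and scores are made up.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of raw attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, scores):
    # Aggregate per-person feature vectors with attention weights
    # instead of plain max/average pooling (illustrative assumption).
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights

def mimic_loss(student_weights, teacher_weights):
    # Assumed loss: mean squared error between the two attention distributions.
    n = len(teacher_weights)
    return sum((s - t) ** 2 for s, t in zip(student_weights, teacher_weights)) / n

def graph_propagate(features, adjacency):
    # One propagation step over the people graph: each person averages the
    # features of itself and its neighbours. `adjacency` has no self-loops;
    # a learned weight matrix and nonlinearity are omitted for brevity.
    n, dim = len(features), len(features[0])
    out = []
    for i in range(n):
        row = [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        degree = sum(row)
        out.append([sum(row[j] * features[j][d] for j in range(n)) / degree
                    for d in range(dim)])
    return out

# Toy example: three people with 2-D appearance features (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
teacher_scores = [2.0, 0.0, 0.0]   # teacher attends mostly to person 0
student_scores = [0.0, 0.0, 0.0]   # untrained student attends uniformly

pooled, student_w = attention_pool(features, student_scores)
_, teacher_w = attention_pool(features, teacher_scores)
loss = mimic_loss(student_w, teacher_w)

adjacency = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # persons 0 and 1 interact
context = graph_propagate(features, adjacency)
```

In a trained system the student's scores would be updated to reduce `loss`, pulling its attention toward the people the teacher considers important; the sketch only computes the quantities involved.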
