Journal: Journal of visual communication & image representation

Language-guided graph parsing attention network for human-object interaction recognition


Abstract

This paper addresses human-object interaction (HOI) recognition, the task of classifying the interaction between humans and objects. The task is challenging partly because the data are extremely imbalanced across classes. To address this, we propose a language-guided graph parsing attention network (LG-GPAN) that uses the distribution of words in language to guide classification in vision. We first associate each HOI class name with a word embedding vector; together these vectors form a language space specific to HOI recognition. In parallel, a visual feature is extracted from the input by the proposed graph parsing attention network (GPAN), which yields a better visual representation. The visual feature is then transformed into a linguistic feature in the language space. Finally, the output score is obtained by measuring the distance between the linguistic feature and the word embedding of each class in the language space. Experimental results on the popular CAD-120 and V-COCO datasets validate our design choices and demonstrate superior performance compared with the state of the art.
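
The scoring step described in the abstract (project the visual feature into the language space, then compare it with each class's word embedding) can be illustrated with a minimal sketch. This is not the authors' implementation: the single linear projection, the L2 normalisation, the squared Euclidean distance, the function name `language_guided_scores`, and all array shapes are assumptions made for illustration, and random vectors stand in for real word embeddings and GPAN features.

```python
import numpy as np

def language_guided_scores(visual_feat, projection, class_embeddings):
    """Score each class by how close the projected feature lies to its word embedding."""
    # Map the visual feature into the language space.
    # Assumption: one linear map; the paper's actual transformation is not specified here.
    linguistic = projection @ visual_feat
    # L2-normalise both sides so the distance depends on direction, not magnitude.
    linguistic = linguistic / np.linalg.norm(linguistic)
    classes = class_embeddings / np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    # Negative squared Euclidean distance: a closer class embedding gives a higher score.
    dists = np.sum((classes - linguistic) ** 2, axis=1)
    return -dists

rng = np.random.default_rng(0)
class_embeddings = rng.normal(size=(4, 8))  # 4 hypothetical HOI classes, 8-dim word vectors
projection = rng.normal(size=(8, 16))       # hypothetical visual-to-language map
visual_feat = rng.normal(size=16)           # stand-in for a GPAN visual feature

scores = language_guided_scores(visual_feat, projection, class_embeddings)
predicted = int(np.argmax(scores))          # index of the nearest class in language space
```

Because every class is scored against a fixed embedding rather than a separately learned classifier weight, rare classes can borrow structure from the language space; this is one plausible reading of how language guidance helps with class imbalance, not a claim taken from the paper.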
