Language-guided graph parsing attention network for human-object interaction recognition

Li Qiyue; Xie Xuemei; Zhang Jin; Shi Guangming

Abstract

This paper focuses on human-object interaction (HOI) recognition, which aims to classify the interactions between humans and objects. The task is challenging partly because the data are extremely imbalanced across classes. To address this problem, we propose a language-guided graph parsing attention network (LG-GPAN) that uses the distribution of words in language to guide classification in vision. We first associate each HOI class name with a word embedding vector; together, these vectors form a language space specific to HOI recognition. In parallel, a visual feature is extracted from the inputs by the proposed graph parsing attention network (GPAN), which provides a better visual representation. The visual feature is then transformed into a linguistic feature in the language space. Finally, the output score is obtained by measuring the distance between the linguistic feature and the word embedding of each class in the language space. Experimental results on the popular CAD-120 and V-COCO datasets validate our design choices and demonstrate superior performance compared with state-of-the-art methods.
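
The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the visual-to-language transformation, the distance measure, or how distances become scores, so everything here is an assumption for illustration: a linear map (`W`, `b`), Euclidean distance, a softmax over negative distances, and random vectors standing in for both the GPAN output and the class word embeddings. The function name `language_space_scores` is hypothetical and is not taken from the paper.

```python
import math
import random


def language_space_scores(visual_feat, W, b, class_embeddings):
    """Score HOI classes by proximity in the language space.

    visual_feat:      list of d_v floats (stand-in for the GPAN output)
    W, b:             d_l x d_v matrix and d_l bias (assumed linear map)
    class_embeddings: one d_l-dimensional word embedding per HOI class
    """
    # Project the visual feature into the language space: z = W v + b.
    linguistic_feat = [
        sum(w_ij * v_j for w_ij, v_j in zip(row, visual_feat)) + b_i
        for row, b_i in zip(W, b)
    ]
    # Euclidean distance (an assumption) to each class embedding.
    dists = [math.dist(linguistic_feat, emb) for emb in class_embeddings]
    # Smaller distance means higher score: softmax over negative distances.
    logits = [-d for d in dists]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data only: random vectors replace real word embeddings and features.
random.seed(0)
num_classes, d_v, d_l = 4, 8, 5
class_embeddings = [[random.gauss(0, 1) for _ in range(d_l)]
                    for _ in range(num_classes)]
W = [[random.gauss(0, 1) for _ in range(d_v)] for _ in range(d_l)]
b = [0.0] * d_l
visual_feat = [random.gauss(0, 1) for _ in range(d_v)]

scores = language_space_scores(visual_feat, W, b, class_embeddings)
predicted = max(range(num_classes), key=scores.__getitem__)
```

Because every class is scored against a fixed embedding of its name, a class with few training samples still has a well-defined target in the language space. This is one plausible reading of how the word distribution can help with class imbalance.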

Bibliographic details

Source
Journal of visual communication & image representation | 2022, Issue 11 | pp. 1.1-1.10 | 10 pages
Authors
Li Qiyue; Xie Xuemei; Zhang Jin; Shi Guangming
Author affiliation

Xidian Univ

Original format: PDF
Text language: English
Keywords
Human-object interaction; Language-guided; Graph parsing attention network; Word embedding;
