Recognizing Gestures from Videos using a Network with Two-branch Structure and Additional Motion Cues

Abstract

In this paper, we propose a method for recognizing gestures from videos that implicitly incorporates multimodal data during training and performs classification using only RGB data. The network is 3D convolutional and consists of a shared network that implicitly incorporates multiple modalities, a generation branch that estimates motion regions, and a classification branch that classifies gestures. We introduce an efficient type of modality data, binarized motion cues, which carry information about moving hand regions and are learned by using the generation network. The binarized motion cues are given as extra supervision for learning motion in the generation branch. Because the features of the additional motion cues learned by the generation branch are implicitly fused with the features learned by the classification branch, classification performance can be improved. Experimental results show that the shared network extracts more discriminative intermediate features, and that the network with the classification branch achieves improved performance using only RGB input data.
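
The training signal described in the abstract can be sketched as a two-term objective: one term for the classification branch and one for the generation branch, which is supervised by the binarized motion cues. The sketch below is only an illustration under stated assumptions. The abstract does not say how the cues are binarized or which losses are used, so the fixed threshold, the binary cross-entropy term, the `weight` parameter, and all function names are hypothetical. In a real network both branches would read features from the shared network, so gradients from both terms would update it.

```python
import math

def binarize_motion(magnitude, threshold=1.0):
    """Turn a 2D map of per-pixel motion magnitudes into a 0/1 mask.
    The fixed threshold is an assumption, not taken from the paper."""
    return [[1 if m > threshold else 0 for m in row] for row in magnitude]

def softmax_cross_entropy(logits, label):
    """Cross-entropy for one sample from raw class scores
    (stand-in for the classification-branch loss)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]

def mask_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and a
    0/1 mask (assumed stand-in for the generation-branch loss)."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count

def joint_loss(class_logits, label, pred_mask, motion_mask, weight=1.0):
    """Classification loss plus weighted motion-cue loss (assumed form)."""
    return (softmax_cross_entropy(class_logits, label)
            + weight * mask_bce(pred_mask, motion_mask))

# Toy example: a 2x2 motion map and a three-class gesture prediction.
mask = binarize_motion([[0.1, 2.5], [3.0, 0.2]])
loss = joint_loss([2.0, 0.5, -1.0], 0, [[0.2, 0.9], [0.8, 0.1]], mask, weight=0.5)
```

At inference time only the classification term's inputs are needed, which matches the abstract's claim that classification uses RGB data alone: the motion mask is a training-time target, not a network input.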

Bibliographic details

Source
International Conference on Automatic Face and Gesture Recognition | 2020 | pp. 133-137 | 5 pages
Authors
Jiaxin Zhou; Takashi Komuro
Original format: PDF
Keywords
Feature extraction; Optical imaging; Videos; Data mining; Training; Optical fiber networks; Shape

Similar literature

1. Dogs fail to recognize a human pointing gesture in two-dimensional depictions of motion cues [J]. Eatherington Carla J., Mongillo Paolo, Looke Miina. Behavioural processes. 2021, Issue 1
2. Recognizing human gestures in videos by modeling the mutual context of body position and hands movement [J]. Gavrilescu Mihai. Multimedia Systems. 2017, Issue 3
3. Recognizing and Presenting the Storytelling Video Structure With Deep Multimodal Networks [J]. Lorenzo Baraldi, Costantino Grana, Rita Cucchiara. Multimedia, IEEE Transactions on. 2017, Issue 5
4. Recognizing emotions from videos by studying facial expressions, body postures and hand gestures [C]. Mihai Gavrilescu. Telecommunications Forum. 2015
5. Using Audio Cues to Support Motion Gesture Interaction on Mobile Devices [D]. Morrison-Smith, Sarah. 2015
6. GestuRe and ACtion Exemplar (GRACE) video database: stimuli for research on manners of human locomotion and iconic gestures [O]. Suzanne Aussems, Natasha Kwok, Sotaro Kita
7. Recognizing and Presenting the Storytelling Video Structure with Deep Multimodal Networks [O]. Baraldi, Lorenzo, Grana, Costantino, Cucchiara, Rita. 2016
8. Preliminary Report on Prospects for Additional Networks. Appendices. Home Video: A Report on the Status, Projected Development and Consumer Use of Videocassette Recorders and Videodisc Players. Program Distribution, Scheduling, and Production Support in the Public Television System [R]. 1980
