Journal: Information Sciences: An International Journal

Deep attention network for joint hand gesture localization and recognition using static RGB-D images



Abstract

This paper presents an effective deep attention network for joint hand gesture localization and recognition using static RGB-D images. Our method trains a CNN framework based on a soft attention mechanism in an end-to-end manner, which is capable of automatically localizing hands and classifying gestures using a single network rather than relying on the conventional means of stage-wise hand segmentation/detection and classification. More precisely, our attention network first computes a weight for each proposal generated from the entire image, in order to judge the probability of the hand appearing in a given region. It then performs a global sum over all proposals, each scaled by its corresponding weight, in order to obtain a representation of the entire image. We demonstrate the feasibility and effectiveness of our method through extensive experiments on the NTU Hand Digits (NTU-HD) benchmark and the challenging HUST American Sign Language (HUST-ASL) dataset. Moreover, the proposed attention network is simple to train, without requiring bounding-box or segmentation mask annotations, which makes it easy to apply in hand gesture recognition systems. Based on the proposed attention network and taking RGB-D images as input, we obtain state-of-the-art hand gesture recognition performance on the challenging HUST-ASL dataset. (C) 2018 Elsevier Inc. All rights reserved.
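
The two steps described in the abstract (a weight per proposal, then a weighted global sum) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how proposals are scored or normalized, so the linear scoring vector `scoring_weights`, the softmax normalization, and the function names `softmax` and `attention_pool` are all assumptions made for illustration, and the feature values are made up.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(proposal_features, scoring_weights):
    """Weight each proposal, then sum the weighted features.

    proposal_features: list of N feature vectors (each a list of D floats)
    scoring_weights:   hypothetical learned vector of D floats (an assumption)
    Returns (weights, pooled), where pooled is one D-dimensional vector.
    """
    # Step 1: one scalar score per proposal (assumed here to be a dot product).
    scores = [
        sum(f * w for f, w in zip(feat, scoring_weights))
        for feat in proposal_features
    ]
    # Normalize scores into attention weights (softmax is an assumption).
    weights = softmax(scores)
    # Step 2: global sum of proposal features, each scaled by its weight.
    dim = len(scoring_weights)
    pooled = [
        sum(a * feat[d] for a, feat in zip(weights, proposal_features))
        for d in range(dim)
    ]
    return weights, pooled

# Toy example: three proposals with 3-dimensional features (made-up values).
features = [[0.1, 0.2, 0.0], [2.0, 1.5, 0.3], [0.0, 0.1, 0.1]]
scoring = [1.0, 1.0, 0.0]
weights, pooled = attention_pool(features, scoring)
```

In this sketch the proposal with the highest score dominates the pooled vector, which is how a soft attention weight can act as an implicit localization signal without bounding-box labels. In a trained network the pooled vector would then feed a gesture classifier, and the scoring parameters would be learned end to end.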

