IEEE Transactions on Multimedia

Gestures In-The-Wild: Detecting Conversational Hand Gestures in Crowded Scenes Using a Multimodal Fusion of Bags of Video Trajectories and Body Worn Acceleration


Abstract

This paper addresses the detection of hand gestures during free-standing conversations in crowded mingle scenarios. Unlike the scenarios of previous works on gesture detection and recognition, crowded mingle scenes present additional challenges such as cross-contamination between subjects, strong occlusions, and nonstationary backgrounds. This makes them more complex to analyze using computer vision techniques alone. We propose a multimodal approach using video and wearable acceleration data recorded via smart badges hung around the neck. In the video modality, we propose to treat noisy dense trajectories as bags-of-trajectories. For a given bag, we can have good trajectories corresponding to the subject and bad trajectories due, for instance, to cross-contamination. However, we hypothesize that for a given class it should be possible to learn which trajectories are discriminative while ignoring the noisy ones. We do this by adopting multiple instance learning via embedded instance selection, which also allows us to identify which instances contribute most to the classification. By fusing the decisions of the classifiers from the video and wearable acceleration modalities, we show improvements over the unimodal approaches, with an AUC of 0.69. We also present a static analysis and a dynamic analysis to assess the impact of noisy data on the fused detection results, showing that moments of high occlusion in the video are compensated by the information from the wearables. Finally, we apply our method to detect speaking status, leveraging the close relationship reported in the literature between hand gestures and speech.
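The abstract combines two ideas: embedding each bag of noisy dense trajectories so that a sparse classifier can select the discriminative instances (multiple instance learning via embedded instance selection), and fusing the decisions of the video and wearable-acceleration classifiers. The sketch below illustrates both ideas on toy data; it is not the authors' implementation. The RBF similarity, the synthetic trajectory and acceleration features, the L1-regularized logistic regression, and the probability-averaging fusion rule are all assumptions made for illustration.

```python
# Minimal sketch (not the authors' code) of MILES-style bag embedding plus
# decision-level fusion of two modalities. All features here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def embed_bags(bags, prototypes, sigma=1.0):
    """Map each bag (n_i x d array of trajectory descriptors) to a fixed-length
    vector: the maximum RBF similarity between any instance in the bag and each
    prototype. Noisy trajectories simply fail to dominate any prototype dimension."""
    embedded = []
    for bag in bags:
        d2 = ((bag[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (n_i, P)
        embedded.append(np.exp(-d2 / (2 * sigma ** 2)).max(axis=0))     # (P,)
    return np.vstack(embedded)

def make_bags(n_bags, label, d=8):
    """Toy bags standing in for dense-trajectory descriptors of one subject."""
    bags, ys = [], []
    for _ in range(n_bags):
        n_inst = rng.integers(5, 15)
        bag = rng.normal(0, 1, (n_inst, d))   # mostly noisy trajectories
        if label == 1:
            bag[0] += 2.0                     # one discriminative trajectory
        bags.append(bag)
        ys.append(label)
    return bags, ys

pos_bags, pos_y = make_bags(60, 1)
neg_bags, neg_y = make_bags(60, 0)
bags, y = pos_bags + neg_bags, np.array(pos_y + neg_y)

# Prototype pool: all training instances pooled together.
prototypes = np.vstack(bags)
X_video = embed_bags(bags, prototypes)

# Sparse (L1) classifier: nonzero weights indicate which prototype instances
# contribute to the decision, i.e. which trajectories are discriminative.
video_clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X_video, y)

# Wearable acceleration modality: a separate classifier on toy per-window features.
X_accel = rng.normal(0, 1, (len(y), 6)) + y[:, None] * 0.8
accel_clf = LogisticRegression().fit(X_accel, y)

# Decision-level fusion: average the two posterior probabilities.
p_fused = 0.5 * (video_clf.predict_proba(X_video)[:, 1]
                 + accel_clf.predict_proba(X_accel)[:, 1])
print("fused AUC (toy data):", round(roc_auc_score(y, p_fused), 3))
```

Averaging posteriors is only one possible fusion rule; the point of the sketch is that each modality is trained independently and the combination happens at the decision level, so moments when one modality is unreliable (e.g. occlusion in the video) can be compensated by the other.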
