Image and Vision Computing

Is that my hand? An egocentric dataset for hand disambiguation

Abstract

With the recent development of wearable cameras, interest in research on the egocentric perspective is increasing. This opens the possibility of working on a specific object detection problem: hand detection and hand disambiguation. However, recent progress in egocentric hand disambiguation, and even hand detection, especially using deep learning, has been limited by the lack of a large dataset with suitable variations in subject, activity, and scene. In this paper, we propose a dataset that simulates daily activities, with variable illumination and people from different cultures and ethnicities, to reflect daily life conditions. We increase the dataset size over previous works to enable robust solutions such as deep neural networks, which need a substantial amount of data for training. Our dataset consists of 50,000 annotated images of 10 different subjects performing 5 different daily activities (biking, eating, kitchen, office and running) in over 40 different scenes with variable illumination and changing backgrounds, and we compare it with previous similar datasets.

Hands in an egocentric view are challenging to detect due to a number of factors, such as shape variations, inconsistent illumination, motion blur, and occlusion. To improve hand detection and disambiguation, context information can be included to aid the detection. In particular, we propose three neural network architectures that jointly learn hand and context information, and we provide baseline results with current object/hand detection approaches. (C) 2019 Elsevier B.V. All rights reserved.
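The abstract does not detail the three joint hand-context architectures, but a minimal sketch can illustrate the general idea of fusing a tight hand crop with its surrounding context for disambiguation. The sketch below is a hypothetical two-branch classifier in PyTorch; the class name HandContextNet, the ResNet-18 backbones, and the four identity classes (own-left, own-right, other-left, other-right) are assumptions made for illustration, not the paper's actual design.

import torch
import torch.nn as nn
import torchvision.models as models

class HandContextNet(nn.Module):
    # Hypothetical two-branch network: one branch encodes a tight hand
    # crop, the other a wider context crop around it; the fused features
    # predict the hand identity (own-left, own-right, other-left,
    # other-right). An illustrative sketch, not the paper's architecture.
    def __init__(self, num_classes: int = 4):
        super().__init__()
        # Two ResNet-18 backbones with the final fc layer removed;
        # each ends in global average pooling, giving (B, 512, 1, 1).
        self.hand_branch = nn.Sequential(
            *list(models.resnet18(weights=None).children())[:-1])
        self.context_branch = nn.Sequential(
            *list(models.resnet18(weights=None).children())[:-1])
        self.classifier = nn.Linear(512 * 2, num_classes)

    def forward(self, hand_crop, context_crop):
        h = self.hand_branch(hand_crop).flatten(1)        # (B, 512)
        c = self.context_branch(context_crop).flatten(1)  # (B, 512)
        # Joint learning happens here: gradients from a single loss
        # flow back through both the hand and the context branches.
        return self.classifier(torch.cat([h, c], dim=1))  # (B, num_classes)

# Example: one 224x224 hand crop paired with its context crop.
model = HandContextNet()
logits = model(torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 4])

In such a design, the context branch supplies cues like arm configuration and viewpoint that a tight hand crop alone cannot provide, which is what makes own-versus-other disambiguation tractable.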