IEEE Conference on Applications of Computer Vision

Improved Deep Learning of Object Category Using Pose Information



Abstract

Despite significant recent progress, the best available computer vision algorithms still lag far behind human capabilities, even for recognizing individual discrete objects under various poses, illuminations, and backgrounds. Here we present a new approach that uses object pose information to improve deep network learning. While existing large-scale datasets, e.g. ImageNet, do not have pose information, we leverage the newly published turntable dataset, iLab-20M, which has 22M images of 704 object instances shot under different lightings, camera viewpoints, and turntable rotations, to conduct more controlled object recognition experiments. We introduce a new convolutional neural network architecture, the what/where CNN (2W-CNN), built on a linear-chain feedforward CNN (e.g., AlexNet) and augmented by hierarchical layers regularized by object poses. Pose information is used only as an additional feedback signal during training, alongside category information, and is not needed at test time. To validate the approach, we train both 2W-CNN and AlexNet on a fraction of the dataset; 2W-CNN achieves a 6% improvement in category prediction. We show mathematically that 2W-CNN has inherent advantages over AlexNet under the stochastic gradient descent (SGD) optimization procedure. Furthermore, we fine-tune object recognition on ImageNet using 2W-CNN and AlexNet features pretrained on iLab-20M; the results show significant improvement over training AlexNet from scratch. Moreover, fine-tuning 2W-CNN features performs even better than fine-tuning pretrained AlexNet features. These results show that features pretrained on iLab-20M generalize well to natural image datasets, and that 2W-CNN learns better features for object recognition than AlexNet.
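The what/where training scheme can be illustrated with a toy multi-head model: a shared feature layer feeds a category ("what") head and an auxiliary pose ("where") head, the pose loss regularizes the shared weights during training, and the pose head is discarded at test time. This is a minimal NumPy sketch of that idea under hypothetical toy dimensions, not the paper's actual 2W-CNN:

```python
# Toy sketch (not the authors' code): shared features with a category head
# and an auxiliary pose head. The pose loss only adds a training-time
# gradient to the shared weights; test-time prediction uses the category
# head alone, mirroring how 2W-CNN uses pose labels.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical toy data: 4-D inputs, 3 category labels, 2 pose labels.
N = 64
X = rng.normal(size=(N, 4))
y_cat = rng.integers(0, 3, size=N)
y_pose = rng.integers(0, 2, size=N)

# Shared linear "features" plus one linear head per task.
W_shared = rng.normal(scale=0.1, size=(4, 8))
W_cat = rng.normal(scale=0.1, size=(8, 3))
W_pose = rng.normal(scale=0.1, size=(8, 2))

lr, lam = 0.1, 0.5  # lam weights the auxiliary pose loss
for _ in range(200):
    H = X @ W_shared                            # shared representation
    p_cat, p_pose = softmax(H @ W_cat), softmax(H @ W_pose)
    # Cross-entropy gradients w.r.t. the logits: softmax minus one-hot.
    g_cat = p_cat.copy();  g_cat[np.arange(N), y_cat] -= 1
    g_pose = p_pose.copy(); g_pose[np.arange(N), y_pose] -= 1
    g_cat /= N; g_pose /= N
    # Pose gradients flow back into the shared weights -- this is the
    # regularizing role pose labels play during training.
    W_shared -= lr * (X.T @ (g_cat @ W_cat.T + lam * (g_pose @ W_pose.T)))
    W_cat -= lr * (H.T @ g_cat)
    W_pose -= lr * lam * (H.T @ g_pose)

# At test time the pose head is dropped; only the category head is used.
pred = softmax((X @ W_shared) @ W_cat).argmax(axis=1)
train_acc = (pred == y_cat).mean()
```

The key design point, as in the paper, is that the pose head contributes gradients to the shared parameters during optimization but is never consulted at inference.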

