Source: IEEE Transactions on Image Processing

Generative Model With Coordinate Metric Learning for Object Recognition Based on 3D Models



Abstract

One of the bottlenecks in acquiring a perfect database for deep learning is the tedious process of collecting and labeling data. In this paper, we propose a generative model trained with synthetic images rendered from 3D models, which reduces the burden of collecting real training data and makes the background conditions more realistic. Our architecture is composed of two sub-networks: a semantic foreground object reconstruction network based on Bayesian inference, and a classification network based on multi-triplet cost training, which avoids overfitting on the monotone synthetic object surfaces and exploits information that is accurate for synthetic images, such as object poses and lighting conditions, and helpful for recognizing regular photos. First, our generative model with metric learning uses additional foreground object channels, generated by the semantic foreground object reconstruction sub-network, to recognize the original input images. A multi-triplet cost function based on poses is used for metric learning, which makes it possible to train an effective categorical classifier purely on synthetic data. Second, we design a coordinate training strategy, with adaptive noise applied to the inputs of both concatenated sub-networks, so that they benefit from each other and avoid inharmonious parameter tuning caused by the different convergence speeds of the two sub-networks. Our architecture achieves a state-of-the-art accuracy of 50.5% on the ShapeNet database despite the data migration obstacle from synthetic images to real images. This pipeline makes it possible to perform recognition on real images based only on 3D models. Our code is available at https://github.com/wangyida/gm-cmln.
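The abstract does not spell out the multi-triplet cost. As a hedged illustration only, the sketch below assumes a common hinge-style formulation in which one anchor embedding is compared against several positives (same category, nearby pose) and several negatives; the function name, squared-Euclidean distance, and margin value are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def multi_triplet_loss(anchor, positives, negatives, margin=0.2):
    """Hinge-style multi-triplet cost (illustrative sketch).

    For one anchor embedding, every positive should be closer to the
    anchor than every negative by at least `margin`; violations are
    penalized linearly and the result is averaged over all triplets.
    """
    def sq_dist(a, b):
        return float(np.sum((a - b) ** 2))

    loss = 0.0
    for p in positives:
        for n in negatives:
            loss += max(0.0, sq_dist(anchor, p) - sq_dist(anchor, n) + margin)
    return loss / (len(positives) * len(negatives))
```

With pose-based triplet mining, positives for an anchor rendering would be drawn from nearby viewpoints of the same category, which is exactly the kind of supervision that rendered 3D models provide for free.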
