MLFNet: Monocular lifting fusion network for 6DoF texture-less object pose estimation

Jiang J.; He Z.; Zhao X.; Zhang S.; Wang Y.; Wu C.


Abstract

© 2022 Elsevier B.V. This paper addresses the challenge of estimating the 6DoF pose of texture-less objects from a single RGB image. Many recent works have shown that two-stage deep learning approaches based on the fusion of 2D geometric intermediate representations achieve remarkable results. These methods implicitly explore the mapping from the 2D appearance domain to the 3D structure domain. However, without the 3D geometric constraints that depth maps provide, it is difficult to extract enough cues from appearance features alone to learn the geometric relation of projection from 3D viewpoints to 2D planes, and the estimation process is extremely sensitive to occlusion. We propose a novel network called MLFNet that lifts the feature space from 2D to 3D based on hybrid 3D geometric intermediate representations. For the first time, we propose surface normals in the object coordinate system as an intermediate representation of pose; their abrupt changes provide strong cues for keypoints, which are usually located where the object surface changes abruptly. Dense 3D surfaces can enhance the geometric consistency of multi-representation constraints and retain more information in occluded scenes. With the proposed multi-modality dual attention mechanism and the embedding of standard 3D shape knowledge, the 2D geometric representation learning process explicitly depends on the fusion of 2D appearance features and 3D geometric features. This standardized information fusion pattern among 2D intermediate representations, 3D intermediate representations, and CAD model priors significantly reduces the network learning space. The proposed method achieves competitive performance on the Linemod dataset and outperforms state-of-the-art methods on the Occlusion Linemod and T-Less datasets, which demonstrates the feasibility of the pose multi-representation fusion technique. The project site is at https://github.com/JJJano/MLFNet.
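
The abstract's idea of surface normals expressed in the object coordinate system can be illustrated with a small NumPy sketch. This is not the MLFNet implementation: the function names, the toy cube normals, and the use of the Kabsch (SVD) alignment are all assumptions chosen for illustration. The sketch only shows that object-frame normals are independent of the camera pose, and that matching them against camera-frame normals constrains the rotation part of a 6DoF pose.

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def normals_object_to_camera(normals_obj, R):
    """Rotate unit normals of shape (N, 3) from the object frame to the camera frame.

    For a column vector n, n_cam = R @ n_obj; for rows this is normals_obj @ R.T.
    Normals are directions, so the translation of the pose does not affect them.
    """
    return normals_obj @ R.T

def rotation_from_normals(normals_obj, normals_cam):
    """Estimate R with normals_cam ~= normals_obj @ R.T via the Kabsch algorithm.

    Hypothetical helper for illustration; MLFNet's actual pose solver may differ.
    """
    H = normals_obj.T @ normals_cam          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T

# Toy data (assumed): normals of three faces of a cube in the object frame.
normals_obj = np.eye(3)
R_true = rotation_z(np.pi / 2)

normals_cam = normals_object_to_camera(normals_obj, R_true)
R_est = rotation_from_normals(normals_obj, normals_cam)
```

In this toy setting `R_est` matches `R_true` up to floating-point error. In a learned pipeline the object-frame normals would be predicted per pixel by a network rather than given.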

Bibliographic details

Source
Neurocomputing | 2022, Issue 14 | pp. 16-29 | 14 pages
Authors
Jiang J.; He Z.; Zhao X.; Zhang S.; Wang Y.; Wu C.
Author affiliations

The State Key Lab of Fluid Power & Mechatronic Systems, Zhejiang University

Department of Mechanical Engineering, University of Shanghai for Science and Technology

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
6DoF pose estimation; Hybrid pose representation; Monocular lifting fusion; Multi-modality dual attention
