End-to-End 6DoF Pose Estimation From Monocular RGB Images

Abstract

We present a conceptually simple framework for 6DoF object pose estimation, aimed in particular at autonomous driving scenarios. Our approach efficiently detects traffic participants in a monocular RGB image while simultaneously regressing their 3D translation and rotation vectors. The proposed method, 6D-VNet, extends Mask R-CNN by adding customised heads that predict each vehicle's finer class, rotation and translation. Unlike previous methods, it is trained end-to-end. Furthermore, we show that including translational regression in the joint losses is crucial for the 6DoF pose estimation task when object translation distance along the longitudinal axis varies significantly, e.g., in autonomous driving scenarios. Additionally, we incorporate the mutual information between traffic participants via a modified non-local block to capture the spatial dependencies among the detected objects. As opposed to the original non-local block implementation, the proposed weighting modification takes the spatial neighbouring information into consideration whilst counteracting the effect of extreme gradient values. We evaluate our method on the challenging real-world Pascal3D+ dataset, and our 6D-VNet reaches first place in the ApolloScape challenge 3D Car Instance task (Apolloscape, 2018; Huang et al., 2018).
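
The role of translational regression in a joint loss can be illustrated with a minimal sketch. The abstract does not give the actual loss terms, so everything below is an assumption made for illustration: the Huber penalty for translation, the quaternion angle for rotation, the weights `w_rot` and `w_trans`, and all function names are hypothetical and are not taken from 6D-VNet.

```python
import math


def huber(residual, delta=1.0):
    """Huber penalty: quadratic near zero, linear for large residuals."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def translation_loss(pred_xyz, true_xyz, delta=1.0):
    """Mean Huber penalty over the x, y, z translation components (assumed form)."""
    return sum(huber(p - t, delta) for p, t in zip(pred_xyz, true_xyz)) / 3.0


def rotation_loss(pred_quat, true_quat):
    """Angle in radians between two quaternions, invariant to the q / -q sign ambiguity."""
    norm_p = math.sqrt(sum(c * c for c in pred_quat))
    norm_t = math.sqrt(sum(c * c for c in true_quat))
    dot = sum(p * t for p, t in zip(pred_quat, true_quat)) / (norm_p * norm_t)
    # Clamp to guard acos against floating-point overshoot.
    dot = min(1.0, abs(dot))
    return 2.0 * math.acos(dot)


def joint_pose_loss(pred_quat, true_quat, pred_xyz, true_xyz,
                    w_rot=1.0, w_trans=1.0):
    """Weighted sum of the rotation and translation terms (hypothetical weights)."""
    return (w_rot * rotation_loss(pred_quat, true_quat)
            + w_trans * translation_loss(pred_xyz, true_xyz))
```

With `w_trans=0` this sketch reduces to a rotation-only objective, so a depth error along the longitudinal axis would go unpenalised; a non-zero `w_trans` is what lets such errors contribute to the training signal.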

Bibliographic record

  • Source
    IEEE Transactions on Consumer Electronics | 2021, Issue 1 | pp. 87-96 | 10 pages
  • Author affiliations

    Shenzhen Univ Inst Artificial Intelligence & Adv Commun Coll Elect & Informat Engn Guangdong Key Lab Intelligent Informat Proc Shenz Shenzhen 518060 Peoples R China;

    PingAn Insurance Grp Co China Ltd Visual Comp Grp Shenzhen 518040 Peoples R China;

    Shenzhen Univ Inst Artificial Intelligence & Adv Commun Coll Elect & Informat Engn Guangdong Key Lab Intelligent Informat Proc Shenz Shenzhen 518060 Peoples R China;

    Shenzhen Univ Inst Artificial Intelligence & Adv Commun Coll Elect & Informat Engn Guangdong Key Lab Intelligent Informat Proc Shenz Shenzhen 518060 Peoples R China;

    Shenzhen Univ Inst Artificial Intelligence & Adv Commun Coll Elect & Informat Engn Guangdong Key Lab Intelligent Informat Proc Shenz Shenzhen 518060 Peoples R China;

    Natl Inst Appl Sci Rennes INSA Rennes Dept EII INSA Rennes F-35700 Rennes France|Inst Elect & Technol NumeR UMR CNRS 6164 F-35000 Rennes France;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    Three-dimensional displays; Pose estimation; Object detection; Head; Cameras; Two dimensional displays; Autonomous vehicles; End-to-end; 6DoF; pose estimation; translation regression; autonomous driving;
