IEEE International Conference on Computer Vision

Cutting Edge: Soft Correspondences in Multimodal Scene Parsing



Abstract

Exploiting multiple modalities for semantic scene parsing has been shown to improve accuracy over the single-modality scenario. Existing methods, however, assume that corresponding regions in the two modalities share the same label. In this paper, we address the problem of data misalignment and label inconsistencies in semantic labeling, e.g., due to moving objects, which violate the assumptions of existing techniques. To this end, we formulate multimodal semantic labeling as inference in a CRF and introduce latent nodes that explicitly model inconsistencies between the two domains. These latent nodes allow us not only to leverage information from both domains to improve their labeling, but also to cut the edges between inconsistent regions. To eliminate the need for hand-tuning the parameters of our model, we propose to learn the intra-domain and inter-domain potential functions from training data. We demonstrate the benefits of our approach on two publicly available datasets containing 2D imagery and 3D point clouds. Thanks to our latent nodes and our learning strategy, our method outperforms the state of the art in both cases.
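The CRF described in the abstract can be sketched as an energy function over a joint 2D/3D labeling, where a binary latent variable on each cross-modal edge either enforces label agreement or "cuts" the edge at a fixed cost. This is a minimal illustrative sketch only: the Potts-style potentials, weights, and cut penalty below are placeholders, not the learned potentials of the paper.

```python
# Hypothetical sketch of a multimodal CRF energy with latent consistency
# variables, loosely following the abstract. All potentials and weights
# are illustrative placeholders, not the authors' learned functions.

def crf_energy(labels_2d, labels_3d, z, unary_2d, unary_3d,
               edges_2d, edges_3d, cross_edges,
               w_intra=1.0, w_inter=1.0, cut_penalty=0.5):
    """Energy of a joint 2D/3D labeling.

    labels_2d, labels_3d : dicts mapping node id -> label
    z                    : dict mapping cross-edge (i, j) -> 0/1
                           (1 = regions consistent, 0 = edge 'cut')
    unary_2d, unary_3d   : dicts mapping node id -> {label: cost}
    edges_2d, edges_3d   : intra-domain neighbor pairs
    cross_edges          : inter-domain correspondence pairs (i_2d, j_3d)
    """
    e = 0.0
    # Unary terms from per-modality classifiers.
    for i, costs in unary_2d.items():
        e += costs[labels_2d[i]]
    for j, costs in unary_3d.items():
        e += costs[labels_3d[j]]
    # Intra-domain smoothness (Potts model: pay w_intra if neighbors disagree).
    for i, k in edges_2d:
        e += w_intra * (labels_2d[i] != labels_2d[k])
    for j, k in edges_3d:
        e += w_intra * (labels_3d[j] != labels_3d[k])
    # Inter-domain terms: the latent variable z can cut an inconsistent
    # edge, paying a fixed penalty instead of the disagreement cost.
    for (i, j) in cross_edges:
        if z[(i, j)]:
            e += w_inter * (labels_2d[i] != labels_3d[j])
        else:
            e += cut_penalty
    return e
```

With one image region and one corresponding 3D segment that genuinely disagree (say, a moving car visible in only one modality), minimizing this energy prefers setting z = 0 whenever the cut penalty is cheaper than forcing the two labels to agree, which is the behavior the latent nodes are introduced for.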

