IEEE Transactions on Cognitive and Developmental Systems

Canonical Correlation Analysis Regularization: An Effective Deep Multiview Learning Baseline for RGB-D Object Recognition



Abstract

Object recognition methods based on multimodal color-plus-depth (RGB-D) data usually treat each modality separately during feature extraction, which neglects the implicit relations between the two views and carries noise from either view into the final representation. To address these limitations, we propose a novel canonical correlation analysis (CCA)-based multiview convolutional neural network (CNN) framework for RGB-D object representation. The RGB and depth streams process their corresponding images and are then connected by a CCA module, leading to a common correlated feature space. In addition, to embed CCA into deep CNNs in a supervised manner, two different schemes are explored. The first treats CCA as a regularization (CCAR) term added to the loss function. However, solving the CCA optimization directly is neither computationally efficient nor compatible with mini-batch-based stochastic optimization. Thus, we further propose an approximation of CCAR that uses the obtained CCA projection matrices to replace the weights of the feature concatenation layer at regular intervals. This scheme enjoys the benefits of full CCAR and is efficient because it amortizes its cost over many training iterations. Experiments on benchmark RGB-D object recognition datasets show that the proposed methods outperform most existing methods that use the same network architectures.
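The abstract mentions CCA projection matrices computed from the two views. As a rough illustration only, the following NumPy sketch shows how such matrices can be obtained with classical linear CCA on two feature sets. It is not the authors' implementation: the function name `cca_projections`, the ridge term `eps`, and the synthetic "RGB" and "depth" features are assumptions made for this example, and the step of writing the matrices into a network layer is not shown.

```python
import numpy as np

def cca_projections(X, Y, k, eps=1e-4):
    """Classical linear CCA (hypothetical helper, not from the paper).

    X: (n, dx) features from one view, Y: (n, dy) features from the other.
    Returns Wx (dx, k), Wy (dy, k) and the top-k canonical correlations.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each view
    Yc = Y - Y.mean(axis=0)
    # Covariance estimates; eps adds a small ridge for numerical stability (assumption).
    Sxx = Xc.T @ Xc / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
    Wx = Sxx_is @ U[:, :k]
    Wy = Syy_is @ Vt.T[:, :k]
    return Wx, Wy, s[:k]

# Synthetic stand-ins for RGB and depth features that share a 2-D latent signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
rgb_feat = z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
depth_feat = z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))

Wx, Wy, corr = cca_projections(rgb_feat, depth_feat, k=2)
print(Wx.shape, Wy.shape, corr)
```

In the approximate scheme described in the abstract, matrices of this kind would be recomputed only at regular intervals, so the cost of the eigendecompositions and the SVD is spread over many training iterations.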
