Conference: IEEE Conference on Computer Vision and Pattern Recognition

Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation

Abstract

Deep neural networks with alternating convolutional, max-pooling and decimation layers are widely used in state-of-the-art architectures for computer vision. Max-pooling purposefully discards precise spatial information in order to create features that are more robust and that are typically organized as lower-resolution spatial feature maps. On some tasks, such as whole-image classification, max-pooling-derived features are well suited. For tasks requiring precise localization, however, such as pixel-level prediction and segmentation, max-pooling destroys exactly the information required to perform well. Precise localization may be preserved by shallow convnets without pooling, but at the expense of robustness. Can we have our max-pooled multilayered cake and eat it too? Several papers have proposed summation- and concatenation-based methods for combining upsampled coarse, abstract features with finer features to produce robust pixel-level predictions. Here we introduce another model, dubbed Recombinator Networks, in which coarse features inform finer features early in their formation, so that finer features can make use of several layers of computation in deciding how to use coarse features. The model is trained once, end-to-end, and performs better than summation-based architectures, reducing the error from the previous state of the art on two facial keypoint datasets, AFW and AFLW, by 30% and beating the current state of the art on 300W without using extra data. We improve performance even further by adding a denoising prediction model based on a novel convnet formulation.
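The difference between summation-based fusion and the early concatenation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the nearest-neighbour upsampling, the 1x1 (pointwise) convolutions, the channel counts, and the random weights are all assumptions chosen to keep the example short.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) feature map (assumed scheme)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def conv1x1_relu(x, w):
    """Pointwise convolution plus ReLU; w has shape (C_out, C_in).
    A stand-in for the real convolutional layers, used here for brevity."""
    return np.maximum(np.einsum("oc,chw->ohw", w, x), 0.0)

def summation_merge(coarse, fine):
    """Summation-style fusion: upsampled coarse features are added to fine ones."""
    return upsample_nearest(coarse) + fine

def recombinator_merge(coarse, fine, weights):
    """Concatenation-style fusion: upsampled coarse features are stacked with
    the fine features along the channel axis, then passed through several
    layers, so the fine branch can learn how to use the coarse information."""
    x = np.concatenate([upsample_nearest(coarse), fine], axis=0)
    for w in weights:
        x = conv1x1_relu(x, w)
    return x

# Hypothetical shapes: 4 channels at 8x8 (coarse) and 4 channels at 16x16 (fine).
rng = np.random.default_rng(0)
coarse = rng.standard_normal((4, 8, 8))
fine = rng.standard_normal((4, 16, 16))
weights = [rng.standard_normal((6, 8)), rng.standard_normal((4, 6))]

merged = recombinator_merge(coarse, fine, weights)
print(merged.shape)  # (4, 16, 16)
```

In the summation variant the coarse and fine maps must have matching channel counts and are combined with fixed, equal weight. In the concatenation variant the layers that follow the merge determine how the two sources are mixed, which is the property the abstract emphasizes.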