Conference: IEEE Conference on Computer Vision and Pattern Recognition

Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimizing Global Loss Functions



Abstract

Recent innovations in training deep convolutional neural network (ConvNet) models have motivated the design of new methods to automatically learn local image descriptors. The latest deep ConvNets proposed for this task consist of a siamese network that is trained by penalising misclassification of pairs of local image patches. Recent results in machine learning show that replacing the siamese network with a triplet network can improve classification accuracy in several problems, but this has yet to be demonstrated for local image descriptor learning. Moreover, current siamese and triplet networks have been trained with stochastic gradient descent that computes the gradient from individual pairs or triplets of local image patches, which can make them prone to overfitting. In this paper, we first propose the use of triplet networks for the problem of local image descriptor learning. We also propose the use of a global loss that minimises the overall classification error in the training set, which can improve the generalisation capability of the model. Using the UBC benchmark dataset for comparing local image descriptors, we show that the triplet network produces a more accurate embedding than the siamese network, as measured by error on the UBC dataset. We also demonstrate that training this triplet network with a combination of the triplet and global losses produces the best embedding in the field. Finally, we show that the central-surround siamese network trained with the global loss produces the best result in the field on the UBC dataset.
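The contrast the abstract draws, between a loss computed from individual triplets and a loss computed over the whole training batch, can be illustrated with a minimal NumPy sketch. The function names, the squared Euclidean distance, the margin, and the weighting are all assumptions made for illustration. In particular, `batch_statistics_loss` is only one plausible form of a batch-level objective (it penalises the spread of matching and non-matching distances and the overlap of their means); the abstract does not give the exact formula, so this should not be read as the authors' definition.

```python
import numpy as np


def squared_distances(a, b):
    """Squared Euclidean distance between corresponding rows of a and b."""
    return np.sum((a - b) ** 2, axis=1)


def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Per-triplet hinge loss, averaged over the batch.

    Each triplet contributes independently: it is penalised only when its
    matching distance is not smaller than its non-matching distance by
    at least `margin` (a hypothetical hyperparameter).
    """
    d_pos = squared_distances(anchor, positive)
    d_neg = squared_distances(anchor, negative)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


def batch_statistics_loss(anchor, positive, negative, margin=1.0, weight=1.0):
    """Illustrative batch-level loss (an assumption, not the paper's formula).

    Uses statistics of the whole batch rather than individual triplets:
    the variance of matching and non-matching distances, plus a hinge on
    the gap between their means.
    """
    d_pos = squared_distances(anchor, positive)
    d_neg = squared_distances(anchor, negative)
    spread = np.var(d_pos) + np.var(d_neg)
    separation = max(0.0, float(np.mean(d_pos) - np.mean(d_neg)) + margin)
    return float(spread + weight * separation)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in embeddings; in practice these would come from the network.
    anchor = rng.normal(size=(8, 16))
    positive = anchor + 0.1 * rng.normal(size=(8, 16))
    negative = rng.normal(size=(8, 16))
    print(triplet_hinge_loss(anchor, positive, negative))
    print(batch_statistics_loss(anchor, positive, negative))
```

In a sketch like this the two terms could simply be summed to form a combined objective; how the paper weights or combines them is not stated in the abstract.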
