In one-shot learning, classifiers trained on disjoint classes with few labelled data points are used to identify visual concepts from other classes. Recently, Siamese networks and similarity layers have been used to solve the one-shot learning problem, achieving state-of-the-art performance on visual-character recognition datasets. Various techniques have been developed over the years to improve the performance of these networks on fine-grained image classification datasets. These techniques have focused primarily on improving the loss and activation functions, augmenting visual features, employing multiscale metric learning, and pre-training and fine-tuning the backbone network. We investigate similarity layers for one-shot learning tasks and propose two frameworks for combining these layers into a MergedNet network. On all four datasets used in our experiments, MergedNet outperformed the baselines in classification accuracy, and it generalises to other datasets when trained on miniImageNet.
Crown Copyright (c) 2022 Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).