IEEE International Symposium on Multimedia

Comprehensive Study of Multiple CNNs Fusion for Fine-Grained Dog Breed Categorization


Abstract

Fine-grained visual categorization aims to distinguish objects at the subordinate-class level rather than the basic-class level, and is a challenging visual task due to the high correlation between subordinate classes and the large intra-class variation (e.g. different object poses). Although deep convolutional neural networks (DCNNs) have brought dramatic success to generic object classification, detection and segmentation thanks to the availability of large-scale training samples, directly applying a DCNN to fine-grained visual categorization, where most public fine-grained image datasets provide only dozens or at most hundreds of training samples per subordinate class, cannot yield satisfactory classification results because of the small number of training samples. This study explores a transfer learning strategy for fine-grained dog breed categorization based on CNN models learned on the large-scale ImageNet dataset, and demonstrates promising performance with two DCNN models: AlexNet and VGG-16. Furthermore, we argue that different DCNN architectures may extract representations of different image aspects, owing to their predefined kernel sizes, numbers of kernels and the various operations in the model learning procedure, and thus yield different categorization performance. This study therefore proposes fusing multiple CNN architectures to combine these different aspect representations for more accurate prediction. We comprehensively study the fusion of different layers, such as Fc6 and Fc7 of AlexNet and VGG-16, and show that the fusion architecture improves on the best single DCNN model, VGG-16, by 2.88%, from 81.2% to 84.08%.
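As a rough illustration of the fusion strategy described in the abstract, the sketch below concatenates the 4096-dimensional Fc7 activations of ImageNet-pretrained AlexNet and VGG-16 and trains only a linear classifier on the fused 8192-dimensional vector. It uses PyTorch/torchvision as the framework; the layer slicing indices, the 120-breed output head, and the frozen-backbone setup are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch of Fc7-level fusion of ImageNet-pretrained AlexNet and VGG-16
# for dog breed classification. Assumptions: 120 breeds, frozen backbones,
# a single trainable linear head; these are not taken from the paper itself.
import torch
import torch.nn as nn
from torchvision import models


class FusedDogBreedClassifier(nn.Module):
    def __init__(self, num_breeds: int = 120):
        super().__init__()
        alexnet = models.alexnet(weights="IMAGENET1K_V1")  # torchvision >= 0.13 API
        vgg16 = models.vgg16(weights="IMAGENET1K_V1")

        # Convolutional backbones and pooling, reused as fixed ImageNet feature extractors.
        self.alex_features, self.alex_pool = alexnet.features, alexnet.avgpool
        self.vgg_features, self.vgg_pool = vgg16.features, vgg16.avgpool

        # Keep the fully connected stack up to and including the Fc7 ReLU
        # (AlexNet classifier[:6], VGG-16 classifier[:5]); each outputs 4096-d.
        self.alex_fc = alexnet.classifier[:6]
        self.vgg_fc = vgg16.classifier[:5]

        # Freeze everything defined so far; only the fusion head below is trained.
        for p in self.parameters():
            p.requires_grad = False

        self.head = nn.Linear(4096 + 4096, num_breeds)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.alex_fc(torch.flatten(self.alex_pool(self.alex_features(x)), 1))
        v = self.vgg_fc(torch.flatten(self.vgg_pool(self.vgg_features(x)), 1))
        return self.head(torch.cat([a, v], dim=1))  # fusion by feature concatenation


if __name__ == "__main__":
    model = FusedDogBreedClassifier().eval()
    dummy = torch.randn(1, 3, 224, 224)  # both networks expect 224x224 RGB input
    with torch.no_grad():
        print(model(dummy).shape)  # torch.Size([1, 120])
```

The same pattern applies to other layer choices (e.g. Fc6 instead of Fc7) by changing the classifier slice, which is how different fusion configurations can be compared.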
