IEEE International Symposium on Multimedia

Comprehensive Study of Multiple CNNs Fusion for Fine-Grained Dog Breed Categorization


Abstract

Fine-grained visual categorization aims to distinguish objects at the level of subordinate classes rather than basic classes, and is a challenging visual task due to the high correlation between subordinate classes and the large intra-class variation (e.g. different object poses). Although deep convolutional neural networks (DCNNs) have achieved dramatic success on generic object classification, detection and segmentation given the availability of large-scale training samples, directly applying a DCNN to fine-grained visual categorization, where most public fine-grained image datasets provide only dozens or at most hundreds of training samples per subordinate class, cannot yield satisfactory classification results because of the small number of training samples. This study explores a transfer learning strategy for fine-grained dog breed categorization based on CNN models learned on the large-scale ImageNet dataset, and demonstrates promising performance with two DCNN models: AlexNet and VGG-16. Furthermore, we argue that different DCNN architectures may extract representations of different image aspects, owing to the pre-defined kernel sizes, kernel numbers and the various operations in the model learning procedure, and thus yield different categorization performance. This study therefore proposes to fuse multiple CNN architectures so as to combine these different aspect representations for more accurate categorization. We comprehensively study the fusion of different layers such as Fc6 and Fc7 in AlexNet and VGG-16, and demonstrate a 2.88% improvement of the fusion architecture over the best single-DCNN performance (VGG-16), from 81.2% to 84.08%.
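
As a concrete illustration of the layer-level fusion described in the abstract, the following is a minimal sketch, not the authors' implementation, of concatenating the Fc7 activations of ImageNet-pretrained AlexNet and VGG-16 and feeding the fused 8192-dimensional vector to a new breed classifier. The use of PyTorch/torchvision, the choice of Fc7 rather than Fc6, and num_classes=120 are illustrative assumptions rather than details taken from the paper.

# Minimal sketch of Fc7-level fusion of AlexNet and VGG-16 (PyTorch/torchvision).
# Not the paper's implementation; layer slicing and num_classes are assumptions.
import torch
import torch.nn as nn
from torchvision import models

class Fc7Fusion(nn.Module):
    def __init__(self, num_classes=120):  # number of breeds: placeholder value
        super().__init__()
        alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
        vgg16 = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

        # Convolutional trunks plus fully connected layers up to Fc7
        # (the final 1000-way ImageNet layer of each network is dropped).
        self.alex_conv, self.alex_pool = alexnet.features, alexnet.avgpool
        self.alex_fc = nn.Sequential(*list(alexnet.classifier.children())[:-1])
        self.vgg_conv, self.vgg_pool = vgg16.features, vgg16.avgpool
        self.vgg_fc = nn.Sequential(*list(vgg16.classifier.children())[:-1])

        # Fused representation: 4096-d (AlexNet Fc7) + 4096-d (VGG-16 Fc7).
        self.head = nn.Linear(4096 + 4096, num_classes)

    def forward(self, x):
        a = self.alex_fc(torch.flatten(self.alex_pool(self.alex_conv(x)), 1))
        v = self.vgg_fc(torch.flatten(self.vgg_pool(self.vgg_conv(x)), 1))
        return self.head(torch.cat([a, v], dim=1))

# Usage: a batch of 224x224 ImageNet-normalized images yields breed logits.
model = Fc7Fusion(num_classes=120).eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 120])

Fusing Fc6 instead would only change which classifier layers are kept in each branch; likewise, in the transfer learning setting described above, the new classification head would typically be trained while the pretrained trunks are kept frozen or lightly fine-tuned.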
