Journal: Multimedia Tools and Applications

Bag-of-Visual-Words codebook generation using deep features for effective classification of imbalanced multi-class image datasets


Abstract

Classification of imbalanced multi-class image datasets is a challenging problem in computer vision. Most real-world datasets are imbalanced because samples are unevenly distributed across classes. The problem with an imbalanced dataset is that the minority class, which has fewer samples, tends to go undetected. Most traditional machine learning algorithms detect the majority class efficiently but lag in detecting the minority class, which ultimately degrades the overall performance of the classification model. In this paper, we propose a novel combination of visual codebook generation using deep features with the non-linear χ² SVM classifier to tackle the imbalance problem that arises with multi-class image datasets. The low-level deep features are first extracted by transfer learning using the pre-trained ResNet-50 network and then clustered using k-means. The center of each cluster is a visual word in the codebook. Each image is then translated into a set of features called the Bag-of-Visual-Words (BOVW), derived from the histogram of visual words in the vocabulary. A detailed empirical analysis finds the non-linear χ² SVM classifier to be the best choice for classifying the resulting features. Hence, with the right combination of learning tools, we are able to classify multi-class imbalanced image datasets effectively. This is shown by higher accuracy, F1-score, and AUC than the state-of-the-art methods in our experiments on two challenging multi-class datasets: Graz-02 and TF-Flowers.
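
The codebook and histogram steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the deep features are already available as plain lists of floats (ResNet-50 extraction is omitted), and it uses a basic k-means loop with a fixed number of iterations. The function names (`build_codebook`, `bovw_histogram`, `chi2_kernel`) and the `gamma` parameter are hypothetical. The kernel shown is the common exponential χ² form, which the abstract does not confirm as the exact variant used.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the visual word (cluster centre) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=20, seed=0):
    """Basic k-means: the k cluster centres become the visual words."""
    rng = random.Random(seed)
    # Assumption: initialise centres from k randomly chosen descriptors.
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bovw_histogram(image_descriptors, codebook):
    """Normalised histogram of visual-word occurrences for one image."""
    counts = [0.0] * len(codebook)
    for d in image_descriptors:
        counts[nearest_word(d, codebook)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-squared kernel between two non-negative histograms."""
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * dist)
```

In use, `build_codebook` would be run once over descriptors pooled from the training images, and `bovw_histogram` would then be called per image. The pairwise `chi2_kernel` values could be assembled into a kernel matrix for an SVM that accepts precomputed kernels. That training step is left out here.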
