Fusing integrated visual vocabularies-based bag of visual words and weighted colour moments on spatial pyramid layout for natural scene image classification

Signal, Image and Video Processing (Springer)

Abstract

The bag of visual words (BOW) model is an efficient image representation technique for image categorization and annotation tasks. Building good visual vocabularies from automatically extracted image feature vectors produces discriminative visual words, which can improve the accuracy of image categorization tasks. Most approaches that use the BOW model for image categorization ignore useful information that can be obtained from the image classes when building visual vocabularies. Moreover, most BOW models use intensity features extracted from local regions and disregard colour information, which is an important characteristic of any natural scene image. In this paper, we show that integrating visual vocabularies generated from each image category improves the BOW image representation and the accuracy of natural scene image classification. We use a keypoint density-based weighting method to combine the BOW representation with image colour information on a spatial pyramid layout. In addition, we show that visual vocabularies generated from the training images of one scene image dataset can plausibly represent another scene image dataset in the same domain. This helps reduce the time and effort needed to build new visual vocabularies. The proposed approach is evaluated over three well-known scene classification datasets with 6, 8 and 15 scene categories, respectively, using 10-fold cross-validation. The experimental results, using support vector machines with a histogram intersection kernel, show that the proposed approach outperforms baseline methods such as Gist features, rgbSIFT features and different configurations of the BOW model.
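To illustrate two of the building blocks named in the abstract, the following is a minimal Python sketch, not the authors' implementation: it builds an "integrated" vocabulary by clustering local descriptors per image class and stacking the class-wise cluster centres, encodes an image as a BOW histogram, and trains an SVM with a histogram intersection kernel. The input descriptors_per_class, the k-means vocabulary size and the scikit-learn/NumPy dependencies are illustrative assumptions; the spatial pyramid, weighted colour moments and keypoint density-based weighting described in the paper are omitted.

# Minimal sketch (illustrative only): per-class vocabularies, BOW encoding,
# and an SVM with a histogram intersection kernel.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_integrated_vocabulary(descriptors_per_class, words_per_class=100, seed=0):
    """Cluster each class's descriptors separately and stack the centres,
    so every class contributes its own visual words to one vocabulary.
    descriptors_per_class: dict mapping class label -> (n_i, d) descriptor array."""
    centres = []
    for label, descs in descriptors_per_class.items():
        km = KMeans(n_clusters=words_per_class, random_state=seed, n_init=10)
        km.fit(descs)
        centres.append(km.cluster_centers_)
    return np.vstack(centres)  # shape: (n_classes * words_per_class, d)

def bow_histogram(descriptors, vocabulary):
    """Assign each local descriptor to its nearest visual word and return an
    L1-normalised word-count histogram for the image."""
    # pairwise squared distances between descriptors and vocabulary words
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / max(hist.sum(), 1.0)

def histogram_intersection_kernel(X, Y):
    """K(x, y) = sum_i min(x_i, y_i), computed for all pairs of rows."""
    return np.array([[np.minimum(x, y).sum() for y in Y] for x in X])

# Usage sketch: X_train / X_test are stacked BOW histograms, y_train the labels.
# clf = SVC(kernel=histogram_intersection_kernel).fit(X_train, y_train)
# predictions = clf.predict(X_test)

Passing a callable kernel to scikit-learn's SVC is standard usage; the callable returns the Gram matrix for the two sets of histograms, so the same function serves both training and prediction.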
