Neurocomputing

Improving the BoVW via discriminative visual n-grams and MKL strategies



Abstract

The Bag-of-Visual-Words (BoVW) representation has been widely used to approach a number of high-level computer vision tasks. The idea behind the BoVW representation is similar to the Bag-of-Words (BoW) representation used in Natural Language Processing (NLP): extract features from the dataset, then build feature histograms that represent each instance. Although the approach is simple and effective, which facilitates its applicability to a wide range of problems, it inherits a well-known limitation of the traditional BoW: it disregards spatial information among the extracted features (sequential information in text), which could be useful for capturing discriminative visual patterns. In this paper, we alleviate this limitation with the joint use of visual words and multi-directional sequences of visual words (visual n-grams). The contribution of this paper is twofold: (i) we build new, simple yet effective visual features inspired by the popular idea of n-gram representations in NLP, and (ii) we propose Multiple Kernel Learning (MKL) strategies to better exploit the joint use of visual words and visual n-grams in Image Classification (IC) tasks. For the former, we propose building a codebook of visual n-grams and using them as attributes to represent images by means of the BoVW representation. For the latter, we consider the visual words and visual n-grams as different feature spaces and propose MKL strategies to better integrate the visual information. We evaluate our proposal on the image classification task using five different datasets: Histopathology, Birds, Butterflies, Scenes, and a 6-class subset of CalTech-101. Experimental results show that the proposed strategies exploiting our visual n-grams outperform or are competitive with (i) the traditional BoVW, (ii) the BoVW using visual n-grams under traditional fusion schemes (e.g., ensemble-based classifiers), and (iii) other approaches in the literature for IC that consider the spatial context.
(C) 2015 Elsevier B.V. All rights reserved.
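The pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a precomputed codebook of visual-word centroids, forms n-grams along a single scan direction (the paper uses multi-directional sequences), and combines the two feature spaces with a fixed-weight sum of linear kernels, whereas actual MKL learns the kernel weights jointly with the classifier. All function names here are illustrative.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Assign each local descriptor to its nearest visual word (codebook row)."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def bovw_histogram(word_ids, k):
    """Standard BoVW: normalized histogram over k visual words."""
    h = np.bincount(word_ids, minlength=k).astype(float)
    return h / max(h.sum(), 1.0)

def ngram_histogram(word_ids, k, n=2):
    """Histogram over visual n-grams: consecutive visual words along
    one scan direction, indexed into a k**n-bin histogram."""
    h = np.zeros(k ** n)
    for i in range(len(word_ids) - n + 1):
        idx = 0
        for w in word_ids[i:i + n]:
            idx = idx * k + int(w)  # encode the n-gram as a base-k integer
        h[idx] += 1.0
    return h / max(h.sum(), 1.0)

def combined_kernel(X_words, X_ngrams, beta=0.5):
    """Fixed-weight convex combination of two linear kernels, one per
    feature space; real MKL would learn beta from data."""
    return beta * (X_words @ X_words.T) + (1 - beta) * (X_ngrams @ X_ngrams.T)
```

A usage sketch: quantize each image's descriptors, build both histograms per image, stack them into matrices `X_words` and `X_ngrams`, and feed `combined_kernel(X_words, X_ngrams)` to a kernel classifier such as an SVM.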
