IEEE Transactions on Circuits and Systems for Video Technology > Contextual Bag-of-Words for Visual Categorization

Contextual Bag-of-Words for Visual Categorization



Abstract

Bag-of-words (BOW), which represents an image by a histogram of local patches over a visual vocabulary, has attracted intensive attention in visual categorization due to its good performance and flexibility. Conventional BOW neglects the contextual relations between local patches because of its naïve Bayes assumption. However, it is well known that contextual relations play an important role when human beings recognize visual categories from local appearance. This paper proposes a novel contextual bag-of-words (CBOW) representation that models two kinds of typical contextual relations between local patches: a semantic conceptual relation and a spatial neighboring relation. To model the semantic conceptual relation, visual words are grouped on multiple semantic levels according to the similarity of the class distributions they induce, and local patches are encoded and images represented accordingly. To explore the spatial neighboring relation, an automatic term extraction technique is adopted to measure the confidence that neighboring visual words are relevant. Word groups with high relevance are selected, and their statistics are incorporated into the BOW representation. Classification is performed using a support vector machine with an efficient kernel that incorporates the relational information. The proposed approach is extensively evaluated on two kinds of visual categorization tasks: video event categorization and scene categorization. Experimental results demonstrate the importance of the contextual relations of local patches, and CBOW shows superior performance to conventional BOW.
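To make the pipeline in the abstract concrete, the following is a minimal sketch of the two base ingredients the paper builds on: quantizing local patch descriptors against a visual vocabulary into a BOW histogram, and collecting statistics over spatially neighboring word pairs. All names, dimensions, and the simple co-occurrence count are illustrative assumptions; the paper's actual method uses automatic term extraction to score neighboring-word relevance and a multi-level semantic grouping, neither of which is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 local patch descriptors (16-D, SIFT-like)
# with (x, y) image coordinates, and an 8-word visual vocabulary that
# stands in for k-means centroids learned from training patches.
descriptors = rng.normal(size=(200, 16))
positions = rng.uniform(0, 100, size=(200, 2))
vocabulary = rng.normal(size=(8, 16))

def assign_words(desc, vocab):
    """Quantize each descriptor to its nearest visual word (conventional BOW step)."""
    d2 = ((desc[:, None, :] - vocab[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def bow_histogram(words, k):
    """Orderless, L1-normalized histogram of visual words."""
    h = np.bincount(words, minlength=k).astype(float)
    return h / h.sum()

def neighbor_pair_counts(words, pos, k, radius=10.0):
    """Count co-occurrences of visual-word pairs whose patches lie within
    `radius` of each other -- a simplified stand-in for the paper's
    spatial-neighboring statistics (the paper instead scores pair
    relevance with a term-extraction confidence measure)."""
    counts = np.zeros((k, k))
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    ii, jj = np.where((dist < radius) & (dist > 0))
    for i, j in zip(ii, jj):
        counts[words[i], words[j]] += 1
    return counts

words = assign_words(descriptors, vocabulary)
hist = bow_histogram(words, 8)            # conventional BOW feature
pairs = neighbor_pair_counts(words, positions, 8)  # contextual statistics
```

In the full CBOW approach, high-relevance pair statistics like `pairs` would be appended to (or kerneled with) the plain histogram `hist` before SVM classification, so that the classifier sees both the orderless word counts and the local context.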
