IEEE Transactions on Industrial Informatics

CMIB: Unsupervised Image Object Categorization in Multiple Visual Contexts

Abstract

Object categorization in images is fundamental to various industrial areas, such as automated visual inspection, fast image retrieval, and intelligent surveillance. Most existing methods treat visual features (e.g., the scale-invariant feature transform) as the content information of objects, while regarding image tags as their contextual information. However, such tags can hardly be acquired in completely unsupervised settings, especially when the image volume is too large to annotate. In this article, we propose a novel contextual multivariate information bottleneck (CMIB) method for unsupervised image object categorization in multiple visual contexts. Instead of relying on manually supplied contexts, the CMIB method first automatically generates a set of high-level basic clusterings from multiple global features; these clusterings are defined as visual contexts because they provide overall information about the target images. A data-compression procedure for object category discovery is then formulated, in which the content and the multiple visual contexts are maximally preserved through a "bottleneck." Specifically, two Bayesian networks are built to characterize the relationship between data compression and information preservation. Finally, a novel sequential information-theoretic optimization is proposed to guarantee the convergence of the CMIB objective function. Experimental results on seven real-world benchmark image datasets demonstrate that the CMIB method outperforms state-of-the-art baselines.
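
The abstract does not specify which global descriptors or which base clustering algorithm are used, so the following is only a minimal sketch of the visual-context generation step, assuming generic global feature matrices and k-means as the basic clustering method; the function name build_visual_contexts and the toy descriptors are hypothetical illustrations, not part of the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_visual_contexts(global_features, n_clusters=10, seed=0):
    """Produce one basic clustering (label vector) per global feature matrix.

    Each element of `global_features` is an (n_images, d_i) array holding one
    global descriptor of the whole image collection; the returned label
    vectors play the role of the "visual contexts" described in the abstract.
    """
    contexts = []
    for feats in global_features:
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        contexts.append(km.fit_predict(feats))
    return contexts

# Toy usage with two random stand-ins for global descriptors
# (e.g., a color histogram and a texture statistic) over 100 images.
rng = np.random.default_rng(0)
feature_views = [rng.random((100, 64)), rng.random((100, 32))]
visual_contexts = build_visual_contexts(feature_views, n_clusters=5)
```

The resulting label vectors would then enter the bottleneck stage alongside the content features; in the generic multivariate information bottleneck framework that the abstract's "two Bayesian networks" suggest, the trade-off between compressing the content and preserving the contexts is typically written as a functional of the form L = I^{G_in} − β · I^{G_out}, though the exact CMIB objective is not given in this abstract.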
