Journal: ACM Transactions on Multimedia Computing, Communications, and Applications

User-Click-Data-Based Fine-Grained Image Recognition via Weakly Supervised Metric Learning


Abstract

We present a novel fine-grained image recognition framework that uses user click data to bridge the semantic gap when distinguishing visually similar categories. Because the query set in click data is usually large-scale and redundant, we first propose a click-feature-based query-merging approach that merges queries with similar semantics and constructs a compact click feature. We then use this compact click feature together with a convolutional neural network (CNN)-based deep visual feature to jointly represent an image. Finally, with the combined feature, we employ a metric-learning-based template-matching scheme for efficient recognition. Considering the heavy noise in the training data, we introduce a reliability variable to characterize image reliability and propose a weakly supervised metric and template learning with smooth assumption and click prior (WMTLSC) method to jointly learn the distance metric, object templates, and image reliability. Extensive experiments are conducted on the public Clickture-Dog dataset and our newly established Clickture-Bird dataset. The results show that click-data-based query merging helps generate a highly compact (the dimension is reduced to 0.9%) and dense click feature for images, which greatly improves computational efficiency. Moreover, combining this click feature with the CNN feature further boosts recognition accuracy. The proposed framework performs much better than previous state-of-the-art methods on fine-grained recognition tasks.
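The pipeline described in the abstract (merge redundant queries, combine click and visual features, match against class templates under a distance metric) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The greedy cosine-similarity merge, the `threshold` value, the toy click counts and visual features, the class-mean templates, and the identity metric `M` are all assumptions made for illustration. In particular, WMTLSC learns the metric, templates, and image reliability jointly, which this sketch does not attempt.

```python
import numpy as np

def merge_queries(clicks, threshold=0.8):
    """Greedily group queries whose click vectors are similar (assumed rule).

    clicks: (n_images, n_queries) matrix of click counts.
    Returns the compact click matrix (n_images, n_groups) and a group
    label for each original query.
    """
    n_images, n_queries = clicks.shape
    norms = np.linalg.norm(clicks, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for unused queries
    unit = clicks / norms            # unit-length click vector per query
    labels = -np.ones(n_queries, dtype=int)
    reps = []                        # one representative vector per group
    for q in range(n_queries):
        for g, rep in enumerate(reps):
            if unit[:, q] @ rep >= threshold:
                labels[q] = g        # join the first sufficiently similar group
                break
        else:
            labels[q] = len(reps)    # no similar group found: start a new one
            reps.append(unit[:, q])
    compact = np.zeros((n_images, len(reps)))
    for q in range(n_queries):
        compact[:, labels[q]] += clicks[:, q]   # sum clicks of merged queries
    return compact, labels

def combine_features(visual, compact_clicks):
    """L2-normalize each modality per image, then concatenate."""
    def l2(x):
        n = np.linalg.norm(x, axis=1, keepdims=True)
        return x / np.where(n == 0, 1.0, n)
    return np.hstack([l2(visual), l2(compact_clicks)])

def classify(features, templates, M):
    """Assign each image to the template with the smallest distance
    (x - t)^T M (x - t)."""
    diff = features[:, None, :] - templates[None, :, :]     # (n, c, d)
    dist = np.einsum('ncd,de,nce->nc', diff, M, diff)        # (n, c)
    return dist.argmin(axis=1)

# Toy data (hypothetical): 4 images, 4 queries, 2 classes.
clicks = np.array([[5, 4, 0, 0],
                   [3, 3, 0, 0],
                   [0, 0, 2, 6],
                   [0, 0, 1, 3]], dtype=float)
visual = np.array([[1.0, 0.1],
                   [0.9, 0.2],
                   [0.1, 1.0],
                   [0.2, 0.8]])       # stand-in for CNN features
y = np.array([0, 0, 1, 1])            # class labels used to build templates

compact, labels = merge_queries(clicks)
features = combine_features(visual, compact)
templates = np.stack([features[y == c].mean(axis=0) for c in (0, 1)])
M = np.eye(features.shape[1])         # identity metric as a placeholder
preds = classify(features, templates, M)
print(labels, compact.shape, preds)
```

In this toy case the four queries collapse into two groups, so the click feature shrinks from four dimensions to two while keeping the class-discriminative signal. With a learned positive semidefinite `M` in place of the identity, the same `classify` function would compute a Mahalanobis-style distance.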
