
Image annotation and tag completion via kernel metric learning and noisy matrix recovery.


Abstract

In the last several years, with the ever-growing popularity of digital photography and social media, the number of images with user-provided tags has increased enormously. Because of the sheer volume and diverse content of these images, there is an urgent need to categorize, index, retrieve and browse them via semantic tags (also called attributes or keywords). Following this trend, image annotation and tag completion from missing and noisy tags over large-scale datasets has become an extremely hot topic at the intersection of machine learning and computer vision.

The overarching goal of this thesis is to reassess image annotation and tag completion algorithms that capture the essential relationships both between and within images and tags, even when the given tag information is incomplete or noisy, so as to achieve better effectiveness and efficiency in image annotation and other tag-related tasks, including tag completion, tag ranking and tag refinement.

One of the key challenges in search-based image annotation models is to define an appropriate similarity measure (distance metric) between images, so that unlabeled images can be assigned the tags shared among similar labeled training images. Many kernel metric learning (KML) algorithms have been developed to serve as such a nonlinear distance metric. However, most of them suffer from high computational cost, since the learned kernel metric needs to be projected onto the positive semi-definite (PSD) cone. Moreover, in image annotation tasks, existing KML algorithms require converting image annotation tags into binary constraints, which leads to a significant loss of semantic information and severely reduces annotation performance.

In this dissertation we propose a robust kernel metric learning (RKML) algorithm, based on a regression technique, that is able to use the image tags directly. RKML is computationally efficient because the PSD property is automatically ensured by the regression technique. Numeric constraints over tags are also applied to better exploit the tag information and hence improve annotation accuracy. Further, theoretical guarantees for RKML are provided, and its efficiency and effectiveness are verified empirically by comparing it with state-of-the-art approaches in both distance metric learning and image annotation.

Since user-provided image tags are always incomplete and noisy, we also propose a tag completion algorithm by noisy matrix recovery (TCMR) that simultaneously fills in missing tags and removes noisy ones. TCMR assumes that the observed tags are independently sampled from unknown distributions that are represented by a tag matrix, and our goal is to recover that tag matrix from the partially revealed, possibly noisy tags. We provide theoretical guarantees for TCMR in the form of recovery error bounds. In addition, a graph Laplacian based component is introduced to make the recovered tags consistent with the visual content of the images. Our empirical study on multiple benchmark datasets for image tagging shows that the proposed algorithm outperforms state-of-the-art approaches in both effectiveness and efficiency when handling missing and noisy tags.
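The search-based annotation idea mentioned in the abstract, where an unlabeled image receives the tags shared among its most similar labeled training images, can be illustrated with a minimal sketch. This is not the RKML algorithm from the dissertation: a fixed Gaussian (RBF) kernel stands in for a learned kernel metric, and the names `rbf_kernel` and `annotate`, the parameters `gamma` and `k`, and the toy data are all assumptions made for illustration.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) similarity between two feature vectors (assumed stand-in kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def annotate(query, train_features, train_tags, k=2, gamma=0.5):
    """Rank tags for `query` by similarity-weighted votes from its k nearest training images."""
    sims = [rbf_kernel(query, f, gamma) for f in train_features]
    # Indices of the k training images most similar to the query.
    nearest = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    scores = {}
    for i in nearest:
        for tag in train_tags[i]:
            # Each neighbor votes for its tags with a weight equal to its similarity.
            scores[tag] = scores.get(tag, 0.0) + sims[i]
    # Highest score first; ties broken alphabetically for a deterministic order.
    return sorted(scores, key=lambda t: (-scores[t], t))

# Hypothetical toy data: 2-D features with a set of tags per training image.
train_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
train_tags = [{"beach", "sea"}, {"beach", "sky"}, {"city", "night"}]

ranked = annotate([0.02, 0.0], train_features, train_tags, k=2)
print(ranked)
```

In this toy setup the tag shared by both near neighbors ranks first, and tags that appear only on the distant third image are never proposed. The quality of such a scheme depends on the similarity measure, which is the role a learned kernel metric would play.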

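Bibliographic record

  • Author: Feng, Zheyun
  • Author affiliation: Michigan State University
  • Degree-granting institution: Michigan State University
  • Subject: Computer science; Computer engineering
  • Degree: Ph.D.
  • Year: 2016
  • Pages: 159 p.
  • Total pages: 159
  • Original format: PDF
  • Language of text: English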
