Visual object recognition using generative models of images

Abstract

Visual object recognition is one of the key human capabilities that we would like machines to have. The problem is the following: given an image of an object (e.g., someone's face), predict its label (e.g., that person's name) from a set of possible object labels. The predominant approach to solving the recognition problem has been to learn a discriminative model, i.e., a model of the conditional probability P(l | υ) over possible object labels l given an image υ.

Here we consider an alternative class of models, broadly referred to as generative models, that learns the latent structure of the image so as to explain how it was generated. This is in contrast to discriminative models, which dedicate their parameters exclusively to representing the conditional distribution P(l | υ). Making finer distinctions among generative models, we consider a supervised generative model of the joint distribution P(υ, l) over image-label pairs, an unsupervised generative model of the distribution P(υ) over images alone, and an unsupervised reconstructive model, which includes models such as autoencoders that can reconstruct a given image but do not define a proper distribution over images. The goal of this thesis is to empirically demonstrate various ways of using these models for object recognition. Its main conclusion is that such models are not only useful for recognition, but can even outperform purely discriminative models on difficult recognition tasks.

We explore four types of applications of generative/reconstructive models for recognition: 1) incorporating complex domain knowledge into the learning by inverting a synthesis model, 2) using the latent image representations of generative/reconstructive models for recognition, 3) optimizing a hybrid generative-discriminative loss function, and 4) creating additional synthetic data for training more accurate discriminative models. Taken together, the results for these applications support the idea that generative/reconstructive models and unsupervised learning have a key role to play in building object recognition systems.
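
As an illustration of how a supervised generative model of the joint distribution can be used for recognition, the following is a minimal sketch, not the thesis's method. It assumes a deliberately simple joint model, P(υ, l) = P(l) · N(υ; mean_l, diag(var_l)), with one diagonal Gaussian per label, and recovers the conditional used for prediction via Bayes' rule, P(l | υ) = P(υ, l) / Σ_l' P(υ, l'). The function names (`fit_joint_model`, `log_joint`, `posterior`, `predict`) and the variance floor of 1e-3 are hypothetical choices made for this example; the image υ is called `images` in the code.

```python
import numpy as np


def fit_joint_model(images, labels):
    """Fit P(v, l) = P(l) * N(v; mean_l, diag(var_l)) by maximum likelihood.

    Assumption for illustration: one diagonal Gaussian per label.
    """
    classes = np.unique(labels)
    priors = np.array([np.mean(labels == c) for c in classes])
    means = np.array([images[labels == c].mean(axis=0) for c in classes])
    # Small floor keeps variances strictly positive (hypothetical constant).
    variances = np.array(
        [images[labels == c].var(axis=0) + 1e-3 for c in classes]
    )
    return classes, priors, means, variances


def log_joint(model, images):
    """Return log P(v, l) with one row per image and one column per label."""
    _, priors, means, variances = model
    diff = images[:, None, :] - means[None, :, :]
    log_lik = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )
    return log_lik + np.log(priors)


def posterior(model, images):
    """P(l | v) = P(v, l) / sum over l' of P(v, l'), computed in log space."""
    lj = log_joint(model, images)
    lj = lj - lj.max(axis=1, keepdims=True)  # subtract row max for stability
    p = np.exp(lj)
    return p / p.sum(axis=1, keepdims=True)


def predict(model, images):
    """Predict the label that maximizes the joint, hence the posterior."""
    classes = model[0]
    return classes[np.argmax(log_joint(model, images), axis=1)]
```

A discriminative model would parameterize P(l | υ) directly. Here the same conditional falls out of the joint model, whose parameters are also spent on describing the images themselves.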

Bibliographic record

  • Author: Nair, Vinod
  • Author affiliation: University of Toronto (Canada)
  • Degree-granting institution: University of Toronto (Canada)
  • Subject: Artificial Intelligence; Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 111
  • Original format: PDF
  • Language: English