
Joint Training of a Neural Network and a Structured Model for Computer Vision.

Abstract

Identifying objects and locating them in real-world images is one of the most important problems in artificial intelligence. The problem is challenging because of occluded objects, varying object viewpoints, and object deformations. These factors make the vision problem extremely difficult, and it cannot be solved efficiently without learning.

This thesis explores hybrid systems that combine a neural network, used as a trainable feature extractor, with structured models that capture high-level information such as object parts. The resulting models combine the strengths of the two approaches: a deep neural network that provides a powerful non-linear feature transformation, and a high-level structured model that integrates domain-specific knowledge. We develop discriminative training algorithms to jointly optimize these models end-to-end.

First, we propose a unified model that combines a deep neural network with a latent topic model for image classification. The hybrid model is shown to outperform models based solely on neural networks or solely on topic models. Next, we investigate techniques for training a neural network system, introducing an effective way of regularizing the network called DropConnect. DropConnect allows us to train large models while avoiding over-fitting, and it yields state-of-the-art results on a variety of standard image classification benchmarks. Third, we work on object detection for the PASCAL challenge: we improve the deformable parts model and propose a new non-maximal suppression algorithm. This system was the joint winner of the 2011 challenge. Finally, we develop a new hybrid model that integrates a deep network, a deformable parts model, and non-maximal suppression. Joint training of our hybrid model shows a clear advantage over training each component individually and achieves competitive results on standard benchmarks.
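The abstract names DropConnect without describing its mechanics. As the technique is commonly described, it regularizes a fully connected layer by randomly zeroing individual weights during training, rather than zeroing activations as Dropout does. The following is a minimal NumPy sketch of that idea, not the thesis's implementation. The function name `dropconnect_linear`, the parameter `keep_prob`, the single mask shared across the batch, the unmasked bias, and the mean-weight approximation at inference are all assumptions made for illustration; the thesis may differ on each point.

```python
import numpy as np


def dropconnect_linear(x, W, b, keep_prob=0.5, training=True, rng=None):
    """Fully connected layer with DropConnect-style weight masking.

    Hypothetical helper for illustration only.
    x: (batch, in_features), W: (in_features, out_features), b: (out_features,)
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    if training:
        rng = np.random.default_rng() if rng is None else rng
        # Assumption: one Bernoulli mask per call, shared by the whole batch.
        # Each weight is kept independently with probability keep_prob.
        mask = rng.random(W.shape) < keep_prob
        return x @ (W * mask) + b
    # Assumption: at inference, replace the random mask by its expectation,
    # i.e. scale the weights by keep_prob (a simple mean approximation).
    return x @ (W * keep_prob) + b


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of 4 inputs with 8 features
W = rng.standard_normal((8, 3))   # weights mapping 8 features to 3 outputs
b = np.zeros(3)

y_train = dropconnect_linear(x, W, b, keep_prob=0.5, training=True, rng=rng)
y_eval = dropconnect_linear(x, W, b, keep_prob=0.5, training=False)
print(y_train.shape, y_eval.shape)
```

Masking weights instead of activations means each training step samples a different sparse sub-network of the same layer, which is the regularizing effect the abstract refers to.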

Bibliographic record

  • Author: Wan, Li
  • Author affiliation: New York University
  • Degree-granting institution: New York University
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2015
  • Pages: 103 p.
  • Original format: PDF
  • Language of text: English