Sparse modeling of high-dimensional data for learning and vision

Abstract

Sparse representations account for most or all of the information of a signal by a linear combination of a few elementary signals called atoms, and have increasingly become recognized as providing high performance for applications as diverse as noise reduction, compression, inpainting, compressive sensing, pattern classification, and blind source separation. In this dissertation, we learn the sparse representations of high-dimensional signals for various learning and vision tasks, including image classification, single image super-resolution, compressive sensing, and graph learning.

Based on the bag-of-features (BoF) image representation in a spatial pyramid, we first transform each local image descriptor into a sparse representation, and then these sparse representations are summarized into a fixed-length feature vector over different spatial locations across different spatial scales by max pooling. The proposed generic image feature representation properly handles the large in-class variance problem in image classification, and experiments on object recognition, scene classification, face recognition, gender recognition, and handwritten digit recognition all lead to state-of-the-art performances on the benchmark datasets.

We cast the image super-resolution problem as one of recovering a high-resolution image patch for each low-resolution image patch based on recent sparse signal recovery theories, which state that, under mild conditions, a high-resolution signal can be recovered from its low-resolution version if the signal has a sparse representation in terms of some dictionary. We jointly learn the dictionaries for high- and low-resolution image patches and enforce them to have common sparse representations for better recovery. Furthermore, we employ image features and enforce patch overlapping constraints to improve prediction accuracy. Experiments show that the algorithm leads to surprisingly good results.

Graph construction is critical for those graph-oriented algorithms designed for the purposes of data clustering, subspace learning, and semi-supervised learning. We model the graph construction problem, including neighbor selection and weight assignment, by finding the sparse representation of a data sample with respect to all other data samples. Since natural signals are high-dimensional signals of a low intrinsic dimension, projecting a signal onto the nearest and lowest dimensional linear subspace is more likely to find its kindred neighbors, and therefore improves the graph quality by avoiding many spurious connections. The proposed graph is informative, sparse, robust to noise, and adaptive to the neighborhood selection; it exhibits exceptionally high performance in various graph-based applications.

To this end, we propose a generic dictionary training algorithm that learns more meaningful sparse representations for the above tasks. The dictionary learning algorithm is formulated as a bilevel optimization problem, which we prove can be solved using stochastic gradient descent. Applications of the generic dictionary training algorithm in supervised dictionary training for image classification, super-resolution, and compressive sensing demonstrate its effectiveness in sparse modeling of natural signals.
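The sparse coding and pooling pipeline described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the solver (ISTA for an l1-regularized least-squares problem), the random dictionary, the regularization weight `lam`, the two-level pyramid, pooling on absolute values, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_sparse_code(D, x, lam=0.1, n_iter=200):
    """Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over a (ISTA)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the smooth part
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the least-squares term
        a = soft_threshold(a - grad / L, lam / L)
    return a

def spatial_pyramid_max_pool(codes, xy, levels=(1, 2)):
    """Max-pool |codes| inside every cell of every pyramid level and concatenate.

    codes: (n_descriptors, n_atoms); xy: descriptor positions scaled to [0, 1).
    """
    pooled = []
    for g in levels:
        cell = np.minimum((xy * g).astype(int), g - 1)   # grid cell of each descriptor
        for i in range(g):
            for j in range(g):
                mask = (cell[:, 0] == i) & (cell[:, 1] == j)
                if mask.any():
                    pooled.append(np.abs(codes[mask]).max(axis=0))
                else:
                    pooled.append(np.zeros(codes.shape[1]))  # empty cell
    return np.concatenate(pooled)

# Toy data (assumption): 8 descriptors of dimension 20, dictionary of 50 unit-norm atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
true_codes = np.zeros((8, 50))
for row in true_codes:
    row[rng.choice(50, size=3, replace=False)] = rng.standard_normal(3)
X = true_codes @ D.T                       # each descriptor mixes 3 atoms
xy = rng.random((8, 2))                    # hypothetical descriptor positions

codes = np.array([ista_sparse_code(D, x) for x in X])
feature = spatial_pyramid_max_pool(codes, xy)
```

With levels `(1, 2)` the feature has one block of 50 values for the whole image followed by four blocks for the 2x2 grid, so its length is fixed regardless of how many descriptors the image contains.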
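The graph construction idea in the abstract, where each sample is sparsely represented by all other samples and the coefficients supply both the neighbors and the edge weights, can be sketched as follows. This is an illustrative approximation under stated assumptions, not the author's algorithm: the l1-regularized solver (ISTA), the value of `lam`, the use of absolute coefficients as weights, and the noise-free toy data on two orthogonal subspaces are all choices made here.

```python
import numpy as np

def lasso_ista(A, b, lam, n_iter=300):
    """Approximately minimize 0.5 * ||b - A w||^2 + lam * ||w||_1 over w (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = w - A.T @ (A @ w - b) / L      # gradient step on the least-squares term
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return w

def sparse_graph(X, lam=0.05):
    """Build a weight matrix W from sparse self-representation.

    X: (n_samples, dim). W[i, j] is the absolute coefficient of sample j
    when sample i is coded over all other samples; W[i, i] stays 0.
    """
    n = X.shape[0]
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm samples
    W = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)             # exclude the sample itself
        w = lasso_ista(Xn[others].T, Xn[i], lam)
        W[i, others] = np.abs(w)
    return W

# Toy data (assumption): 6 points on each of two orthogonal 2-D subspaces of R^6.
rng = np.random.default_rng(0)
X = np.zeros((12, 6))
X[:6, :2] = rng.standard_normal((6, 2))
X[6:, 2:4] = rng.standard_normal((6, 2))

W = sparse_graph(X)
```

In this idealized setting, samples from the other subspace are orthogonal to the sample being coded, so they receive no weight and every edge stays inside a subspace. Real data are noisy and the subspaces overlap, so cross-group weights would generally be small rather than exactly zero.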

Bibliographic record

  • Author: Yang Jianchao
  • Year: 2011
  • Original format: PDF
  • Language of text: English
