Computational models of early visual processing layers.

Abstract

Visual information passes through successive processing layers along the visual pathway: the retina, the lateral geniculate nucleus (LGN), the primary visual cortex (V1), the prestriate cortex (V2), and beyond. Understanding the functional roles of these layers would not only help explain psychophysical and neuroanatomical observations about them, but also help build intelligent computer vision systems that exhibit human-like behavior and performance. One popular theory about the functional role of visual perception, the efficient coding theory, hypothesizes that the early visual processing layers serve to capture the statistical structure of the visual inputs by removing the redundancy in the visual outputs. Linear implementations of the efficient coding theory, such as independent component analysis (ICA) and sparse coding, learn visual features exhibiting the receptive field properties of V1 simple cells when applied to grayscale image patches.

In this dissertation, we explore different aspects of the early visual processing layers by building computational models that follow the efficient coding theory.

(1) We develop a hierarchical model, Recursive ICA, that captures nonlinear statistical structures of the visual inputs that a single layer of ICA cannot capture. The model is motivated by the idea that higher layers of the visual pathway, such as V2, might work under computational principles similar to those of the primary visual cortex. Hence we apply a second layer of ICA on top of the first-layer ICA outputs. To allow the second layer to better capture nonlinear statistical structures, we derive a coordinate-wise nonlinear activation function that transforms the first-layer ICA outputs into the second-layer ICA inputs. When applied to grayscale image patches, the model's second layer learns nonlinear visual features, such as texture boundaries and shape contours.

We apply the above model to natural scene images, such as forest and grassland, to learn generic visual features, and then use these features for face and handwritten digit recognition. We obtain higher recognition rates than systems built with features designed specifically for face and digit recognition.

(2) We show that retinal coding, the pre-cortical stage of visual processing, can also be explained by the efficient coding theory. The retinal coding model turns out to be a variant of Sparse PCA, a technique widely applied in signal processing, financial analysis, bioinformatics, and other fields. Compared with ICA, which removes the redundancy among the input dimensions, Sparse PCA removes redundancy among the input samples. We apply Sparse PCA to grayscale images, chromatic images, grayscale videos, environmental sound, and human speech, and learn visual and auditory features that exhibit the filtering properties of retinal ganglion cells and auditory nerve fibers. This work suggests that the pre-cortical stages of the visual and auditory pathways might work under similar computational principles.
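The two-layer structure described in item (1) can be sketched roughly as follows. This is a minimal illustration, not the dissertation's implementation: it assumes NumPy, uses a plain symmetric FastICA with a tanh contrast for both layers, and substitutes a placeholder nonlinearity, `log(|y| + eps)`, for the coordinate-wise activation function that the dissertation derives (the abstract does not state that function). The names `whiten`, `fast_ica`, and `recursive_ica` are hypothetical, and random data stands in for image patches.

```python
import numpy as np

def whiten(X):
    """Center the data (rows are samples) and PCA-whiten it."""
    X = X - X.mean(axis=0)
    cov = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    return X @ (eigvecs / np.sqrt(eigvals + 1e-8))

def sym_decorrelate(W):
    """Orthogonalize the rows of W: W <- (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W

def fast_ica(Z, n_iter=100, seed=0):
    """Symmetric FastICA (tanh contrast) on whitened data Z; returns the unmixing matrix."""
    rng = np.random.default_rng(seed)
    d = Z.shape[1]
    W = sym_decorrelate(rng.standard_normal((d, d)))
    for _ in range(n_iter):
        Y = Z @ W.T                      # current source estimates
        G = np.tanh(Y)                   # contrast function g
        G_prime = 1.0 - G ** 2           # its derivative g'
        # Fixed-point update per row: E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = (G.T @ Z) / Z.shape[0] - np.diag(G_prime.mean(axis=0)) @ W
        W = sym_decorrelate(W_new)
    return W

def recursive_ica(X, n_iter=100, eps=1e-3):
    """Two stacked ICA layers joined by a coordinate-wise nonlinearity."""
    Z1 = whiten(X)
    W1 = fast_ica(Z1, n_iter, seed=0)
    Y1 = Z1 @ W1.T                       # first-layer outputs
    # ASSUMPTION: placeholder nonlinearity, not the one derived in the dissertation.
    H = np.log(np.abs(Y1) + eps)
    Z2 = whiten(H)
    W2 = fast_ica(Z2, n_iter, seed=1)
    Y2 = Z2 @ W2.T                       # second-layer outputs
    return W1, W2, Y2

# Usage on synthetic data (a stand-in for flattened grayscale patches).
rng = np.random.default_rng(42)
patches = rng.laplace(size=(2000, 6)) @ rng.standard_normal((6, 6))
W1, W2, Y2 = recursive_ica(patches)
print(W1.shape, W2.shape, Y2.shape)
```

The placeholder discards the sign of each first-layer output and keeps its magnitude, so the second layer sees which first-layer features are active together; the function actually derived in the dissertation may differ.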

Bibliographic record

  • Author

    Shan, Honghao

  • Author affiliation

    University of California, San Diego

  • Degree-granting institution: University of California, San Diego
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 97 p.
  • Total pages: 97
  • Original format: PDF
  • Language of text: English
