Computational models of early visual processing layers.

Abstract

Visual information passes through successive processing layers along the visual pathway: the retina, the lateral geniculate nucleus (LGN), the primary visual cortex (V1), the prestriate cortex (V2), and beyond. Understanding the functional roles of these layers would not only help explain psychophysical and neuroanatomical observations about them, but also help build intelligent computer vision systems that exhibit human-like behavior and performance. One popular theory about the functional role of visual perception, the efficient coding theory, hypothesizes that the early visual processing layers capture the statistical structure of the visual inputs by removing redundancy in the visual outputs. Linear implementations of the efficient coding theory, such as independent component analysis (ICA) and sparse coding, learn visual features that exhibit the receptive field properties of V1 simple cells when applied to grayscale image patches.

In this dissertation, we explore different aspects of the early visual processing layers by building computational models that follow the efficient coding theory.

(1) We develop a hierarchical model, Recursive ICA, that captures nonlinear statistical structures of the visual inputs that a single layer of ICA cannot capture. The model is motivated by the idea that higher layers of the visual pathway, such as V2, might work under computational principles similar to those of the primary visual cortex. Hence we apply a second layer of ICA on top of the first-layer ICA outputs. To allow the second layer to better capture nonlinear statistical structures, we derive a coordinate-wise nonlinear activation function that transforms the first-layer ICA outputs into the second-layer ICA inputs. When applied to grayscale image patches, the model's second layer learns nonlinear visual features, such as texture boundaries and shape contours.

We apply the above model to natural scene images, such as forest and grassland, to learn generic visual features, and then use these features for face and handwritten digit recognition. We obtain higher recognition rates than systems built with features designed specifically for face and digit recognition.

(2) We show that retinal coding, the pre-cortical stage of visual processing, can also be explained by the efficient coding theory. The retinal coding model turns out to be a variation of Sparse PCA, a technique widely applied in signal processing, financial analysis, bioinformatics, and other fields. Whereas ICA removes redundancy among the input dimensions, Sparse PCA removes redundancy among the input samples. We apply Sparse PCA to grayscale images, chromatic images, grayscale videos, environmental sound, and human speech, and learn visual and auditory features that exhibit the filtering properties of retinal ganglion cells and auditory nerve fibers. This work suggests that the pre-cortical stages of the visual and auditory pathways might work under similar computational principles.
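The two-layer scheme in contribution (1) can be illustrated with a minimal sketch. It is not the dissertation's implementation. The abstract does not give the coordinate-wise activation function, so the sketch substitutes a hypothetical stand-in: it takes the magnitude of each first-layer output and maps it to a standard normal distribution by rank. The ICA step is a plain symmetric FastICA with a tanh contrast, and all function names (`whiten`, `fast_ica`, `gaussianize_magnitudes`, `two_layer_ica`, `make_demo_data`) and the synthetic Laplacian data are assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def whiten(X):
    """Center X (samples x dims) and transform it to identity covariance."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / Xc.shape[0])
    return Xc @ eigvec / np.sqrt(eigval)


def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, a matrix with orthonormal rows."""
    eigval, eigvec = np.linalg.eigh(W @ W.T)
    return (eigvec / np.sqrt(eigval)) @ eigvec.T @ W


def fast_ica(Z, n_iter=200, seed=0):
    """Symmetric FastICA (tanh contrast) on whitened data Z; returns the unmixing matrix."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    W = sym_decorrelate(rng.standard_normal((d, d)))
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)                      # g(w_i^T z) for every sample
        mean_slope = (1.0 - G**2).mean(axis=0)    # E[g'(w_i^T z)] per component
        W = sym_decorrelate(G.T @ Z / n - mean_slope[:, None] * W)
    return W


def gaussianize_magnitudes(U):
    """Hypothetical stand-in activation: discard the sign of each output,
    then map each coordinate to a standard normal by rank."""
    n = U.shape[0]
    normal = NormalDist()
    quantiles = np.array([normal.inv_cdf((i + 0.5) / n) for i in range(n)])
    ranks = np.abs(U).argsort(axis=0).argsort(axis=0)
    return quantiles[ranks]


def two_layer_ica(X):
    """First-layer ICA, coordinate-wise nonlinearity, then second-layer ICA."""
    Z1 = whiten(X)
    U = Z1 @ fast_ica(Z1).T                       # first-layer outputs
    Z2 = whiten(gaussianize_magnitudes(U))
    V = Z2 @ fast_ica(Z2).T                       # second-layer outputs
    return U, V


def make_demo_data(n=2000, d=4, seed=0):
    """Synthetic stand-in for image patches: linearly mixed Laplacian sources."""
    rng = np.random.default_rng(seed)
    sources = rng.laplace(size=(n, d))
    mixing = np.eye(d) + 0.3 * rng.standard_normal((d, d))
    return sources @ mixing.T
```

Calling `two_layer_ica(make_demo_data())` returns the first- and second-layer outputs. To approximate the setting in the abstract, the synthetic data would be replaced by flattened grayscale image patches, one patch per row.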


Bibliographic record

Author: Shan, Honghao
Author affiliation: University of California, San Diego
Degree-granting institution: University of California, San Diego
Discipline: Computer Science
Degree: Ph.D.
Year: 2010
Pages: 97
Original format: PDF
Language: English

Similar literature

1. Context integration in visual processing: a computational model of center-surround suppression in the visual system [J]. Christoph Metzner, Achim Schweikard, Bartosz Zurowski. BMC Neuroscience. 2010, Supplement 1.
2. Auditory alerting enhances visual attentional processing: Evidence from computational modeling and event-related lateralizations [J]. Iris Wiegand, Anders Petersen, Jon Lansner. Journal of Vision. 2016, No. 12.
3. Computational models of cortical visual processing [J]. David J. Heeger, Eero P. Simoncelli, J. Anthony Movshon. Proceedings of the National Academy of Sciences of the United States of America. 1996, No. 2.
4. Hyperbolic Modeling for Metaphorical Processing and Visual Computations [C]. Hawley K. Rising III. SPIE Conference on Human Vision and Electronic Imaging. 2009.
5. Computational modeling of visual motion processing neurons in the dorsal medial superior temporal area (MSTD): Functional architecture and learning mechanisms [D]. Pitts, Robert Ian. 2004.
6. Context integration in visual processing: a computational model of center-surround suppression in the visual system [O]. Christoph Metzner, Achim Schweikard, Bartosz Zurowski. 2010.
7. Context integration in visual processing: a computational model of center-surround suppression in the visual system [O]. Schweikard Achim, Metzner Christoph, Zurowski Bartosz. 2010.
8. Computational Models of the Perceptual, Cognitive, and Motor Processes Involved in the Visual Search of Pull-Down Menus and Computer Screens [R]. Hornof, A. J. 1999.
