A persistent worry with computational models of unsupervised learning is that learning will become more difficult as the problem is scaled. We examine this issue in the context of a novel hierarchical, generative model that can be viewed as a non-linear generalization of factor analysis and can be implemented in a neural network. The model performs perceptual inference in a probabilistically consistent manner by using top-down, bottom-up and lateral connections. These connections can be learned using simple rules that require only locally available information. We first demonstrate that the model can extract a sparse, distributed, hierarchical representation of global disparity from simplified random-dot stereograms. We then investigate some of the scaling properties of the algorithm on this problem and find that: (1) increasing the image size leads to faster and more reliable learning; (2) increasing the depth of the network from one to two hidden layers leads to better representations at the first hidden layer; and (3) once one part of the network has discovered how to represent disparity, it "supervises" other parts of the network, greatly speeding up their learning.
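To make the training data concrete, the following is a minimal sketch of how a simplified random-dot stereogram with a single global disparity might be generated. The abstract does not specify the construction details, so the image width, dot density, wrap-around shift, and all function and parameter names here are illustrative assumptions rather than the paper's actual setup.

```python
import numpy as np

def make_stereogram(width=10, disparity=1, dot_prob=0.5, rng=None):
    """Generate one simplified 1-D random-dot stereogram.

    The right image is a copy of the left image shifted by `disparity`
    pixels, so the only global structure relating the two half-images
    is the shift itself. (Hypothetical construction; parameters are
    assumptions, not taken from the paper.)
    """
    rng = np.random.default_rng() if rng is None else rng
    left = (rng.random(width) < dot_prob).astype(float)
    right = np.roll(left, disparity)  # wrap-around shift as a simplification
    return np.concatenate([left, right])  # one flat input vector

# Example: a small batch of training cases with disparity +1 or -1
rng = np.random.default_rng(0)
batch = np.stack([make_stereogram(disparity=int(rng.choice([-1, 1])), rng=rng)
                  for _ in range(4)])
print(batch.shape)  # (4, 20)
```

A learner that extracts "global disparity" from such inputs must represent the shift relating the two halves rather than any individual dot pattern, which is why this task probes distributed, hierarchical representation rather than template matching.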