Statistics and Computing

Statistical modelling of artificial neural networks using the multi-layer perceptron

Abstract

Multi-layer perceptrons (MLPs), a common type of artificial neural network (ANN), are widely used in computer science and engineering for object recognition, discrimination and classification, and have more recently found use in process monitoring and control. "Training" such networks is not a straightforward optimisation problem, and we examine features of these networks which contribute to the optimisation difficulty. Although the original "perceptron", developed in the late 1950s (Rosenblatt 1958; Widrow and Hoff 1960), had a binary output from each "node", this was not compatible with back-propagation and similar training methods for the MLP. Hence the output of each node (and the final network output) was made a differentiable function of the network inputs. We reformulate the MLP model with the original perceptron in mind, so that each node in the "hidden layers" can be considered as a latent (that is, unobserved) Bernoulli random variable. This maintains the property of binary output from the nodes, and with an imposed logistic regression of the hidden-layer nodes on the inputs, the expected output of our model is identical to the MLP output with a logistic sigmoid activation function (for the case of one hidden layer). We examine the usual MLP objective function, the sum of squares, and show its multi-modal form and the corresponding optimisation difficulty. We also construct the likelihood for the reformulated latent variable model and maximise it by standard finite mixture ML methods using an EM algorithm, which provides stable ML estimates from random starting positions without the need for regularisation or cross-validation. Over-fitting of the number of nodes does not affect this stability. This algorithm is closely related to the EM algorithm of Jordan and Jacobs (1994) for the Mixture of Experts model. We conclude with some general comments on the relation between the MLP and latent variable models.
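
To make the one-hidden-layer identity concrete, here is a minimal sketch in notation of our own choosing (the symbols w_j and v_j, and the assumption that the output is linear in the hidden states, are illustrative and not taken from the paper). Each hidden node is a latent Bernoulli variable whose success probability is a logistic regression on the input x:

\[
  z_j \mid x \;\sim\; \mathrm{Bernoulli}\!\big(\sigma(w_j^{\top} x)\big),
  \qquad \sigma(a) = \frac{1}{1 + e^{-a}}, \qquad j = 1, \dots, H.
\]

If the output is linear in the hidden states, \( y = v_0 + \sum_{j=1}^{H} v_j z_j + \varepsilon \), then linearity of expectation gives

\[
  \mathbb{E}[y \mid x] \;=\; v_0 + \sum_{j=1}^{H} v_j \, \sigma\big(w_j^{\top} x\big),
\]

which is exactly the forward pass of a one-hidden-layer MLP with logistic sigmoid hidden units and a linear output node.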
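The EM strategy described in the abstract can also be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes a Gaussian output that is linear in the H latent Bernoulli nodes (and omits hidden-layer bias terms for brevity), treats the marginal likelihood as a finite mixture over the 2^H hidden configurations, and alternates an exact E-step with weighted least squares for the output weights plus one Newton step of a weighted logistic regression per hidden node (a generalised EM update). All names and the toy data are ours.

import itertools
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def em_latent_mlp(X, y, H=2, n_iter=300):
    """EM for the latent-Bernoulli reformulation: one hidden layer of H
    Bernoulli nodes, Gaussian output linear in the hidden states."""
    n, d = X.shape
    Z = np.array(list(itertools.product([0, 1], repeat=H)), dtype=float)  # (K, H)
    K = Z.shape[0]
    D = np.hstack([np.ones((K, 1)), Z])        # output-regression design, (K, H+1)
    W = rng.normal(scale=0.5, size=(H, d))     # hidden-node logistic weights
    v = rng.normal(scale=0.5, size=H + 1)      # output intercept and weights
    s2 = float(np.var(y))                      # output noise variance
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(z = Z[k] | x_i, y_i).
        P = np.clip(sigmoid(X @ W.T), 1e-12, 1 - 1e-12)          # (n, H)
        log_prior = np.log(P) @ Z.T + np.log(1 - P) @ (1 - Z).T  # (n, K)
        mu = D @ v                                               # (K,)
        resid = y[:, None] - mu[None, :]                         # (n, K)
        log_post = log_prior - 0.5 * (np.log(2 * np.pi * s2) + resid**2 / s2)
        log_post -= log_post.max(axis=1, keepdims=True)
        r = np.exp(log_post)
        r /= r.sum(axis=1, keepdims=True)
        # M-step, output: weighted least squares over the K configurations
        # (the tiny ridge terms below are numerical jitter only).
        c = r.sum(axis=0)                                        # (K,)
        v = np.linalg.solve(D.T @ (c[:, None] * D) + 1e-8 * np.eye(H + 1),
                            D.T @ (r.T @ y))
        mu = D @ v
        s2 = float((r * (y[:, None] - mu[None, :])**2).sum() / n)
        # M-step, hidden nodes: one Newton step of a weighted logistic
        # regression of the fractional targets E[z_j | data] on the inputs.
        T = r @ Z                                                # (n, H)
        for j in range(H):
            p = sigmoid(X @ W[j])
            g = X.T @ (T[:, j] - p)
            Hess = (X * (p * (1 - p))[:, None]).T @ X + 1e-6 * np.eye(d)
            W[j] += np.linalg.solve(Hess, g)
    return W, v, s2

# Toy data generated from a two-node MLP with logistic hidden units.
X = rng.normal(size=(500, 3))
W_true = np.array([[1.0, -2.0, 0.5], [-1.0, 1.0, 2.0]])
v_true = np.array([1.0, 2.0, -3.0])
y = v_true[0] + sigmoid(X @ W_true.T) @ v_true[1:] + 0.1 * rng.normal(size=500)

W_hat, v_hat, s2_hat = em_latent_mlp(X, y, H=2)
# Expected output of the latent model = MLP forward pass with logistic units.
y_hat = v_hat[0] + sigmoid(X @ W_hat.T) @ v_hat[1:]
print("root mean squared error:", np.sqrt(np.mean((y - y_hat) ** 2)))

With H hidden nodes the mixture has 2^H components, so this brute-force E-step is practical only for small H; the sketch is meant to show the structure of the algorithm, not to scale.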
