
Improved learning algorithms for mixture of experts in multiclass classification


Abstract

Mixture of experts (ME) is a modular neural network architecture for supervised learning. A double-loop Expectation-Maximization (EM) algorithm has been introduced to the ME architecture for adjusting the parameters, and the iteratively reweighted least squares (IRLS) algorithm is used to perform maximization in the inner loop [Jordan, M. I., & Jacobs, R. A. (1994). Hierarchical mixtures of experts and the EM algorithm. Neural Computation, 6(2), 181-214]. However, it has been reported in the literature that the IRLS algorithm is unstable, and the ME architecture trained by this EM algorithm, with IRLS in the inner loop, often performs poorly in multiclass classification. In this paper, the reason for this instability is explored. We find that, owing to an incorrect assumption of parameter independence implicitly imposed in multiclass classification, an incomplete Hessian matrix is used in the IRLS algorithm. Based on this finding, we apply the Newton-Raphson method, which adopts the exact Hessian matrix, to the inner loop of the EM algorithm in the multiclass case. To tackle the expensive computation of the Hessian matrix and its inverse, we propose an approximation to the Newton-Raphson algorithm based on a so-called generalized Bernoulli density. The Newton-Raphson algorithm and its approximation have been applied to synthetic, benchmark, and real-world multiclass classification tasks. For comparison, the IRLS algorithm and a quasi-Newton algorithm, BFGS, have also been applied to the same tasks. Simulation results show that the proposed learning algorithms avoid the instability problem and enable the ME architecture to achieve good performance in multiclass classification. In particular, our approximation algorithm leads to fast learning. In addition, the limitation of our approximation algorithm is also empirically investigated in this paper.
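The abstract's central technical point, that the inner-loop IRLS update uses an incomplete Hessian because it implicitly treats the weight vectors of different classes as independent, can be made concrete. Below is a minimal sketch of one Newton-Raphson step for a softmax (multinomial logit) model, with a flag switching between the exact Hessian and the block-diagonal approximation. It is not the authors' implementation; the function names and the ridge term are illustrative assumptions.

```python
# A minimal sketch, assuming a softmax (multinomial logit) model; not the
# authors' code. Names and the ridge regularizer are illustrative.
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)      # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def hessian(X, P, exact=True):
    """Hessian of the multinomial log-likelihood w.r.t. the stacked weights.

    Block (j, k) is -X^T diag(p_j * (delta_jk - p_k)) X.  With exact=False,
    the off-diagonal (j != k) blocks are dropped -- the implicit
    parameter-independence assumption that yields an incomplete Hessian.
    """
    n, d = X.shape
    K = P.shape[1]
    H = np.zeros((K * d, K * d))
    for j in range(K):
        for k in range(K):
            if not exact and j != k:
                continue                       # keep diagonal blocks only
            w = P[:, j] * ((j == k) - P[:, k])
            H[j*d:(j+1)*d, k*d:(k+1)*d] = -(X * w[:, None]).T @ X
    return H

def newton_step(X, Y, W, exact=True, ridge=1e-6):
    """One Newton-Raphson update of the weight matrix W (shape d x K).

    Y holds one-hot targets.  The small ridge keeps the solve well posed,
    since the exact softmax Hessian is singular (over-parameterization).
    """
    P = softmax(X @ W)
    g = (X.T @ (Y - P)).ravel(order='F')       # gradient, stacked class-wise
    H = hessian(X, P, exact=exact)
    delta = np.linalg.solve(H - ridge * np.eye(H.shape[0]), -g)
    return W + delta.reshape(W.shape, order='F')
```

With exact=False, only the K diagonal blocks enter the update; the cross-class blocks X^T diag(p_j p_k) X are discarded, which is the incomplete-Hessian behavior the paper identifies as the source of instability in multiclass problems.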

Record details

  • Authors

    Chen K.; Xu L.; Chi H.;

  • Author affiliation
  • Year: 1999
  • Total pages
  • Original format: PDF
  • Language: eng
  • CLC classification
