We investigate a class of hierarchical mixtures-of-experts (HME) models in which generalized linear models with nonlinear mean functions of the form ψ(α + x^T β) are mixed, where ψ(·) is the inverse link function. We show that mixtures of such mean functions can approximate a class of smooth functions of the form ψ(h(x)), where h(·) ∈ W^∞_{2;K} (a Sobolev class over [0,1]^s), as the number of experts m in the network increases. An upper bound on the approximation rate is given: O(m^{-2/s}) in the L_p norm. This rate can be achieved within the family of HME structures with no more than s layers, where s is the dimension of the predictor x.
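To make the object of study concrete, the following is a minimal sketch (not the paper's construction) of a one-layer mixture of GLM experts with a logistic inverse link: each expert outputs ψ(α_j + x^T β_j), and a softmax gating network combines the experts. All parameter names and shapes here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    # Logistic inverse link function psi(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def moe_mean(x, alphas, betas, gammas, deltas):
    """Mean of a one-layer mixture of m GLM experts at predictor x.

    x      : (s,)   predictor vector
    alphas : (m,)   expert intercepts alpha_j
    betas  : (s, m) expert slopes beta_j (columns)
    gammas : (m,)   gating intercepts (illustrative parameterization)
    deltas : (s, m) gating slopes (illustrative parameterization)
    """
    logits = gammas + x @ deltas            # gating scores, shape (m,)
    g = np.exp(logits - logits.max())
    g /= g.sum()                            # softmax gate weights, sum to 1
    experts = sigmoid(alphas + x @ betas)   # expert means psi(alpha_j + x^T beta_j)
    return float(g @ experts)               # gated convex combination
```

Because the gate weights form a convex combination, the mixture mean always lies in the range of ψ; the approximation result concerns how well such combinations can track a target ψ(h(x)) as m grows.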