Neural Computation

Learning Coefficients of Layered Models When the True Distribution Mismatches the Singularities
Abstract

Hierarchical learning machines such as layered neural networks have singularities in their parameter spaces. At singularities, the Fisher information matrix becomes degenerate, with the result that the conventional learning theory of regular statistical models does not hold. Recently, it was proved that if the parameter of the true distribution is contained in the singularities of the learning machine, the generalization error in Bayes estimation is asymptotically equal to λ/n, where 2λ is smaller than the dimension of the parameter and n is the number of training samples. However, the constant λ strongly depends on the local geometrical structure of the singularities; hence, the generalization error is not yet clarified when the true distribution is almost, but not completely, contained in the singularities. In this article, in order to analyze such cases, we study the Bayes generalization error under the condition that the Kullback distance of the true distribution from the distribution represented by the singularities is in proportion to 1/n, and we show two results. First, if the dimension of the parameter from inputs to hidden units is not larger than three, then there exists a region of true parameters such that the generalization error is larger than that of the corresponding regular model. Second, if the dimension from inputs to hidden units is larger than three, then for an arbitrary true distribution, the generalization error is smaller than that of the corresponding regular model.
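As a loose numerical illustration of the abstract's comparison (not code from the paper): the asymptotic Bayes generalization error is λ/n for a singular model and d/(2n) for a regular model with d parameters. The values of λ and d below are assumed purely for illustration; the abstract's claim 2λ < d guarantees the singular rate is the smaller one when the true parameter lies on the singularities.

```python
def regular_gen_error(d: int, n: int) -> float:
    """Asymptotic Bayes generalization error of a regular model: d/(2n)."""
    return d / (2.0 * n)


def singular_gen_error(lam: float, n: int) -> float:
    """Asymptotic Bayes generalization error at a singularity: lambda/n."""
    return lam / n


# Assumed values for illustration only: a model with d = 10 parameters
# and learning coefficient lambda = 3, satisfying 2*lambda < d.
d, lam = 10, 3.0

for n in (100, 1000, 10000):
    print(f"n={n:6d}  singular: {singular_gen_error(lam, n):.5f}  "
          f"regular: {regular_gen_error(d, n):.5f}")
```

For every sample size the singular rate λ/n = 3/n stays below the regular rate d/(2n) = 5/n, which is the situation the abstract contrasts with the near-singular case it actually analyzes.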

Bibliographic Record

  • Source
    Neural Computation | 2003, No. 5 | pp. 1013-1033 | 21 pages
  • Author affiliation

    Precision and Intelligence Laboratory, Tokyo Institute of Technology, Midori-ku, Yokohama, 226-8503 Japan

  • Indexed in: Science Citation Index (SCI); Chemical Abstracts (CA)
  • Original format: PDF
  • Language: English
  • Classification: Artificial intelligence theory
