Pattern Recognition Letters

Selection properties of type II maximum likelihood (empirical Bayes) in linear models with individual variance components for predictors



Abstract

Maximum likelihood (ML) in the linear model overfits when the number of predictors (M) exceeds the number of objects (N). One possible solution is the relevance vector machine (RVM), a form of automatic relevance determination that has gained popularity in the pattern recognition and machine learning community through the well-known textbook of Bishop (2006). RVM assigns individual precisions to the weights of predictors, which are then estimated by maximizing the marginal likelihood (type II ML or empirical Bayes). We investigated the selection properties of RVM both analytically and by experiments in a regression setting. We show analytically that RVM selects predictors when the absolute z-ratio (|least squares estimate| / standard error) exceeds 1 in the case of orthogonal predictors and, for M = 2, that this still holds true for correlated predictors when the other z-ratio is large. RVM selects the stronger of two highly correlated predictors. In experiments with real and simulated data, RVM is outcompeted by other popular regularization methods (LASSO and/or PLS) in terms of prediction performance. We conclude that type II ML is not the general answer in high-dimensional prediction problems. In extensions of RVM aimed at stronger selection, improper priors (based on the inverse gamma family) have been assigned to the inverse precisions (variances), with parameters estimated by penalized marginal likelihood. We critically assess this approach and suggest a proper variance prior related to the Beta distribution, which gives similar selection and shrinkage properties and allows a fully Bayesian treatment.
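The selection rule stated in the abstract can be illustrated numerically. The sketch below (our own construction, not the paper's code; variable names and the simulated weights are assumptions) builds an orthonormal design, computes least-squares estimates and their standard errors, and applies the |z-ratio| > 1 rule that the abstract attributes to RVM in the orthogonal case:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 100, 4

# Orthonormal predictors via QR decomposition, so X'X = I
X, _ = np.linalg.qr(rng.standard_normal((n, m)))

# Two informative predictors, two null predictors (illustrative values)
w_true = np.array([2.0, 1.0, 0.0, 0.0])
y = X @ w_true + rng.standard_normal(n) * 0.3

# Least-squares estimates; with orthonormal columns these are X'y
w_ls = X.T @ y

# Residual variance and standard errors of the estimates
resid = y - X @ w_ls
sigma2 = resid @ resid / (n - m)
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))

# Selection rule from the paper: keep predictor i iff |z_i| > 1
z = w_ls / se
selected = np.abs(z) > 1.0
print("z-ratios:", z.round(2))
print("selected:", selected)
```

Because the informative predictors have z-ratios well above 1, they are retained, while a null predictor is retained only when its noise-driven z-ratio happens to exceed 1, which is consistent with the paper's observation that the |z| > 1 threshold gives only weak selection.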


