Effect fusion using model-based clustering

Abstract: In social and economic studies, many of the collected variables are measured on a nominal scale, often with a large number of categories. The definition of categories can be ambiguous, and different classification schemes using either a finer or a coarser grid are possible. Categorization has an impact when such a variable is included as a covariate in a regression model: too fine a grid will result in imprecise estimates of the corresponding effects, whereas with too coarse a grid important effects will be missed, resulting in biased effect estimates and poor predictive performance. To achieve an automatic grouping of the levels of a categorical covariate with essentially the same effect, we adopt a Bayesian approach and specify the prior on the level effects as a location mixture of spiky Normal components. Model-based clustering of the effects during MCMC sampling makes it possible to simultaneously detect categories which have essentially the same effect size and identify variables with no effect at all. Fusion of level effects is induced by a prior on the mixture weights which encourages empty components. The properties of this approach are investigated in simulation studies. Finally, the method is applied to analyse effects of high-dimensional categorical predictors on income in Austria.
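
The way a prior on the mixture weights can encourage empty components may be easier to see in a small simulation. The sketch below is not the authors' sampler and is not taken from the paper. It assumes, purely for illustration, a symmetric Dirichlet prior on the weights with a concentration parameter called `e0` here, and the function names, the number of components `k` and the number of levels `n_levels` are all hypothetical. It draws weights from the prior, allocates the levels of one categorical covariate to components, and counts how many components end up occupied.

```python
import random


def sample_dirichlet(e0, k, rng):
    """Draw one weight vector from a symmetric Dirichlet(e0, ..., e0).

    Assumption: a Dirichlet prior is used only as an illustrative choice
    of a weight prior that can favour empty components.
    """
    while True:
        draws = [rng.gammavariate(e0, 1.0) for _ in range(k)]
        total = sum(draws)
        if total > 0.0:  # redraw if every gamma draw underflowed to zero
            return [d / total for d in draws]


def count_occupied(e0, n_levels, k, rng):
    """Allocate n_levels level effects to k components; count non-empty ones."""
    weights = sample_dirichlet(e0, k, rng)
    labels = rng.choices(range(k), weights=weights, k=n_levels)
    return len(set(labels))


def mean_occupied(e0, n_levels=20, k=10, n_rep=2000, seed=1):
    """Average number of occupied components under the prior alone."""
    rng = random.Random(seed)
    total = sum(count_occupied(e0, n_levels, k, rng) for _ in range(n_rep))
    return total / n_rep


if __name__ == "__main__":
    for e0 in (0.05, 1.0, 5.0):
        print(f"e0={e0}: mean occupied components = {mean_occupied(e0):.2f}")
```

Under these assumptions, a small `e0` concentrates the prior mass on a few components, so most components stay empty and many levels share one component, which corresponds to fusing their effects. This shows only the prior allocation step; the regression likelihood, the spiky Normal components and the MCMC updates described in the abstract are not modelled.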
