Conference: International Symposium on Methodologies for Intelligent Systems

Extending Logistic Regression Models with Factorization Machines

Abstract

Including categorical variables with many levels in a logistic regression model easily leads to a sparse design matrix. This can result in a large, ill-conditioned optimization problem, causing overfitting, extreme coefficient values, and long run times. Inspired by recent developments in matrix factorization, we propose four new strategies for overcoming this problem. Each strategy uses a Factorization Machine that transforms the categorical variables with many levels into a few numeric variables, which are subsequently used in the logistic regression model. Using Factorization Machines also allows interactions between the many-level categorical variables to be included, often substantially increasing model accuracy. The four strategies have been tested on four data sets, demonstrating the superiority of our approach over other methods of handling categorical variables with many levels. In particular, our approach has been successfully used to develop high-quality risk models at the Netherlands Tax and Customs Administration.
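
To make the idea concrete, here is a minimal sketch of a standard second-order Factorization Machine score applied to two one-hot encoded categorical variables. It is an illustration under stated assumptions, not the four strategies from the paper, which the abstract does not spell out. The names `fm_score`, `fm_score_naive`, `one_hot_pair`, and `latent_features` are hypothetical, the level counts and the latent dimension `k` are arbitrary, and the parameters are random rather than trained. Reading "a few numeric variables" as the latent vectors of the active levels is likewise an assumption.

```python
import numpy as np

def fm_score(x, w0, w, V):
    """Second-order Factorization Machine score for one feature vector x."""
    linear = w0 + x @ w
    # Pairwise term via the usual O(k*n) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ]
    s = x @ V
    s2 = (x ** 2) @ (V ** 2)
    return linear + 0.5 * np.sum(s ** 2 - s2)

def fm_score_naive(x, w0, w, V):
    """Same score with an explicit double loop, for checking only."""
    total = w0 + x @ w
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total

# Assumed toy setup: two categorical variables with 50 and 30 levels.
rng = np.random.default_rng(0)
n_levels_a, n_levels_b, k = 50, 30, 3
n = n_levels_a + n_levels_b
w0 = 0.1
w = rng.normal(scale=0.01, size=n)       # untrained, random weights
V = rng.normal(scale=0.1, size=(n, k))   # one k-dim latent vector per level

def one_hot_pair(a, b):
    """Sparse design-matrix row: level a of variable A, level b of variable B."""
    x = np.zeros(n)
    x[a] = 1.0
    x[n_levels_a + b] = 1.0
    return x

def latent_features(a, b):
    """Assumed reading: 2*k dense numeric variables replacing n one-hot columns."""
    return np.concatenate([V[a], V[n_levels_a + b]])

x = one_hot_pair(7, 4)
score = fm_score(x, w0, w, V)
features = latent_features(7, 4)
```

With one-hot inputs, the pairwise term reduces to the inner product of the two active levels' latent vectors, which is how the model captures an interaction between the two variables without a separate coefficient for every level pair. In this sketch, `features` has 6 entries instead of 80 one-hot columns and could be passed to a logistic regression in place of the sparse encoding.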
