Journal: Machine Learning

Block coordinate descent algorithms for large-scale sparse multiclass classification

Abstract

Over the past decade, ℓ1 regularization has emerged as a powerful way to learn classifiers with implicit feature selection. More recently, mixed-norm (e.g., ℓ1/ℓ2) regularization has been utilized as a way to select entire groups of features. In this paper, we propose a novel direct multiclass formulation specifically designed for large-scale and high-dimensional problems such as document classification. Based on a multiclass extension of the squared hinge loss, our formulation employs ℓ1/ℓ2 regularization so as to force weights corresponding to the same features to be zero across all classes, resulting in compact and fast-to-evaluate multiclass models. For optimization, we employ two globally-convergent variants of block coordinate descent, one with line search (Tseng and Yun in Math. Program. 117:387-423, 2009) and the other without (Richtárik and Takáč in Math. Program. 1-38, 2012a; Tech. Rep. arXiv:1212.0873, 2012b). We present the two variants in a unified manner and develop the core components needed to efficiently solve our formulation. The end result is a couple of block coordinate descent algorithms specifically tailored to our multiclass formulation. Experimentally, we show that block coordinate descent performs favorably compared to other solvers such as FOBOS, FISTA and SpaRSA. Furthermore, we show that our formulation obtains very compact multiclass models and outperforms ℓ1/ℓ2 regularized multiclass logistic regression in terms of training speed, while achieving comparable test accuracy.
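To illustrate how ℓ1/ℓ2 regularization can zero out a feature's weights across all classes at once, here is a minimal sketch of the group soft-thresholding operator (the standard proximal operator of the ℓ2 norm) applied to one block, where a block is the row of weights for a single feature. It is not the paper's algorithm: the function names, the fixed step size 1/lipschitz, and the toy numbers are assumptions made for illustration, and the gradient of the loss is passed in rather than computed.

```python
import math


def group_soft_threshold(row, threshold):
    """Proximal operator of threshold * ||row||_2.

    `row` holds one feature's weights, one entry per class.
    The whole row is set to zero when its norm is at most `threshold`.
    """
    norm = math.sqrt(sum(w * w for w in row))
    if norm <= threshold:
        return [0.0] * len(row)
    scale = 1.0 - threshold / norm
    return [scale * w for w in row]


def block_update(row, grad_row, lam, lipschitz):
    """One proximal-gradient step on a single block (hypothetical helper).

    `grad_row` is the gradient of the smooth loss with respect to this row,
    `lam` is the regularization strength, and `lipschitz` is an assumed
    per-block step-size constant (step size = 1 / lipschitz).
    """
    stepped = [w - g / lipschitz for w, g in zip(row, grad_row)]
    return group_soft_threshold(stepped, lam / lipschitz)


# Toy rows with a zero gradient, so only the shrinkage effect is visible.
weak = block_update([0.3, -0.4, 0.0], [0.0, 0.0, 0.0], lam=1.0, lipschitz=1.0)
strong = block_update([3.0, -4.0, 0.0], [0.0, 0.0, 0.0], lam=1.0, lipschitz=1.0)
print(weak)    # whole row removed: the feature is dropped for every class
print(strong)  # row kept, shrunk toward zero by a common factor
```

A row whose norm falls below the threshold is removed entirely, which is why the resulting model uses the same small set of features for every class.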