Block coordinate descent algorithms for large-scale sparse multiclass classification

Mathieu Blondel; Kazuhiro Seki; Kuniaki Uehara


Abstract

Over the past decade, ℓ1 regularization has emerged as a powerful way to learn classifiers with implicit feature selection. More recently, mixed-norm (e.g., ℓ1/ℓ2) regularization has been utilized as a way to select entire groups of features. In this paper, we propose a novel direct multiclass formulation specifically designed for large-scale and high-dimensional problems such as document classification. Based on a multiclass extension of the squared hinge loss, our formulation employs ℓ1/ℓ2 regularization so as to force weights corresponding to the same features to be zero across all classes, resulting in compact and fast-to-evaluate multiclass models. For optimization, we employ two globally-convergent variants of block coordinate descent, one with line search (Tseng and Yun in Math. Program. 117:387-423, 2009) and the other without (Richtárik and Takáč in Math. Program. 1-38, 2012a; Tech. Rep. arXiv:1212.0873, 2012b). We present the two variants in a unified manner and develop the core components needed to efficiently solve our formulation. The end result is a couple of block coordinate descent algorithms specifically tailored to our multiclass formulation. Experimentally, we show that block coordinate descent performs favorably compared to other solvers such as FOBOS, FISTA and SpaRSA. Furthermore, we show that our formulation obtains very compact multiclass models and outperforms ℓ1/ℓ2 regularized multiclass logistic regression in terms of training speed, while achieving comparable test accuracy.
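The way ℓ1/ℓ2 regularization removes a feature for all classes at once can be illustrated with the standard group soft-thresholding (proximal) operator applied to one feature's row of weights. The following is a minimal sketch under stated assumptions, not the authors' solver: the function names, the toy weight matrix and the threshold value are hypothetical, and the loss gradient and step-size selection that a block coordinate descent update would also involve are omitted.

```python
import math


def group_soft_threshold(row, threshold):
    """Proximal operator of threshold * ||row||_2 for one feature's weights.

    Shrinks the row toward zero and returns an all-zero row when its
    Euclidean norm does not exceed the threshold.
    """
    norm = math.sqrt(sum(w * w for w in row))
    if norm <= threshold:
        return [0.0] * len(row)
    scale = 1.0 - threshold / norm
    return [scale * w for w in row]


def l1_l2_norm(weights):
    """Sum over features (rows) of the Euclidean norm of the per-class weights."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)


# Hypothetical weight matrix: one row per feature, one column per class.
W = [
    [3.0, 4.0],   # row norm 5.0: shrunk but kept
    [0.3, -0.4],  # row norm 0.5: zeroed for every class
    [0.0, 2.0],   # row norm 2.0: shrunk but kept
]

# Apply the operator block by block (here, one block = one feature row).
W_new = [group_soft_threshold(row, threshold=1.0) for row in W]

# Features that still carry a nonzero weight for at least one class.
active_features = [j for j, row in enumerate(W_new) if any(w != 0.0 for w in row)]
```

Because a row is either kept or zeroed as a whole, a feature dropped by the operator can be skipped for every class at prediction time, which is what makes the resulting models compact.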


Bibliographic details

Source
Machine Learning, 2013, Issue 1, pp. 31-52 (22 pages)
Authors
Mathieu Blondel; Kazuhiro Seki; Kuniaki Uehara
Author affiliations

Graduate School of System Informatics, Kobe University, 1-1 Rokkodai, Nada, Kobe 657-8501, Japan (all three authors)

Original format: PDF
Language: English
Keywords
Multiclass classification; Group sparsity; Block coordinate descent

