...
首页> 外文期刊>IEEE Transactions on Signal Processing: A publication of the IEEE Signal Processing Society >Online Categorical Subspace Learning for Sketching Big Data with Misses
【24h】

Online Categorical Subspace Learning for Sketching Big Data with Misses

机译:Online Categorical Subspace Learning for Sketching Big Data with Misses

获取原文
获取原文并翻译 | 示例

摘要

With the scale of data growing every day, reducing the dimensionality (a.k.a. sketching) of high-dimensional data has emerged as a task of paramount importance. Relevant issues to address in this context include the sheer volume of data that may consist of categorical observations, the typically streaming format of acquisition, and the possibly missing entries. To cope with these challenges, this paper develops a novel categorical subspace learning approach to unravel the latent structure for three prominent categorical (bilinear) models, namely, Probit, Tobit, and Logit. The deterministic Probit and Tobit models treat data as quantized values of an analog-valued process lying in a low-dimensional subspace, while the probabilistic Logit model relies on low dimensionality of the data log-likelihood ratios. Leveraging the low intrinsic dimensionality of the sought models, a rank regularized maximum-likelihood estimator is devised, which is then solved recursively via alternating majorization-minimization to sketch high-dimensional categorical data "on the fly." The resultant lightweight first-order algorithms entail highly parallelizable tasks per iteration. In addition, the quantization thresholds are also learned jointly with the subspace to enhance the predictive power of the soughtmodels. Performance of the subspace iterates is analyzed for both infinite and finite data streams, where for the former asymptotic convergence to the stationary point set of the batch estimator is established, while for the latter sublinear regret bounds are derived for the empirical cost. Simulated tests with both synthetic and real-world datasets corroborate the merits of the novel schemes for real-time movie recommendation and chess-game classification.

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号