Journal: Pattern Analysis and Machine Intelligence, IEEE Transactions on

Canonical Correlation Analysis for Multilabel Classification: A Least-Squares Formulation, Extensions, and Analysis

Abstract

Canonical Correlation Analysis (CCA) is a well-known technique for finding the correlations between two sets of multidimensional variables. It projects both sets of variables onto a lower-dimensional space in which they are maximally correlated. CCA is commonly applied to supervised dimensionality reduction, in which the two sets of variables are derived from the data and the class labels, respectively. It is well known that CCA can be formulated as a least-squares problem in the binary-class case. However, the extension to the more general setting remains unclear. In this paper, we show that under a mild condition that tends to hold for high-dimensional data, CCA in the multilabel case can be formulated as a least-squares problem. Based on this equivalence relationship, efficient algorithms for solving least-squares problems can be applied to scale CCA to very large data sets. In addition, we propose several CCA extensions, including the sparse CCA formulation based on the 1-norm regularization. We further extend the least-squares formulation to partial least squares. Moreover, we show that the CCA projection for one set of variables is independent of the regularization on the other set of multidimensional variables, providing new insights on the effect of regularization on CCA. We have conducted experiments using benchmark data sets. Experiments on multilabel data sets confirm the established equivalence relationships. Results also demonstrate the effectiveness and efficiency of the proposed CCA extensions.
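
The equivalence described in the abstract can be illustrated numerically with a small NumPy sketch. Everything specific here is an assumption for illustration and is not taken from the abstract: the data are synthetic, both X (features by samples) and Y (labels by samples) are centered, the centered X has rank n - 1 (the kind of condition that tends to hold when the dimension exceeds the sample count), and the least-squares target is chosen as Y^T (Y Y^T)^{-1/2}. The helper names are hypothetical.

```python
import numpy as np

def inv_sqrt_spd(A):
    """Inverse square root of a symmetric positive definite matrix."""
    evals, evecs = np.linalg.eigh(A)
    return (evecs / np.sqrt(evals)) @ evecs.T

def make_data(d=50, n=20, k=3, seed=0):
    """Synthetic centered data X (d x n) and centered binary labels Y (k x n)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((d, n))
    X -= X.mean(axis=1, keepdims=True)  # centered; generically rank n - 1 since d >= n - 1
    # Label j of sample i is bit j of i, so a sample may carry several labels.
    Y = np.array([[(i >> j) & 1 for i in range(n)] for j in range(k)], dtype=float)
    Y -= Y.mean(axis=1, keepdims=True)
    return X, Y

def least_squares_projection(X, Y, rcond=1e-10):
    """Minimum-norm W minimizing ||X^T W - T||_F, with assumed target T = Y^T (Y Y^T)^{-1/2}."""
    T = Y.T @ inv_sqrt_spd(Y @ Y.T)
    # Explicit rcond so the numerically zero singular value of centered X is discarded.
    W, *_ = np.linalg.lstsq(X.T, T, rcond=rcond)
    return W

def cca_matrix(X, Y, rcond=1e-10):
    """Matrix whose leading eigenvectors span the CCA projection for X."""
    Sxx_pinv = np.linalg.pinv(X @ X.T, rcond=rcond)
    Syy_inv = np.linalg.inv(Y @ Y.T)
    return Sxx_pinv @ X @ Y.T @ Syy_inv @ Y @ X.T

X, Y = make_data()
W_ls = least_squares_projection(X, Y)
M = cca_matrix(X, Y)

# If the equivalence holds, every column of W_ls is an eigenvector of M with eigenvalue 1.
print("max |M W_ls - W_ls|:", np.abs(M @ W_ls - W_ls).max())
# The eigenvalues of M lie in [0, 1], so a trace close to k means there are no other nonzero ones.
print("trace(M):", np.trace(M))
```

Under these assumptions the first printed value should be near zero and the trace should be close to the number of labels, which indicates that the least-squares solution spans the same subspace as the CCA projection. In practice this is what would allow an iterative least-squares solver to stand in for the eigendecomposition on large data sets.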
