Published in: Procedia Computer Science

Generalized canonical correlation analysis for labeled data

Abstract

Multi-view learning extracts standard features from different information sources and is used in fields such as medical data analysis, computer vision, and web data analysis. Canonical correlation analysis (CCA), a dimensionality reduction method for multi-view data, finds a low-dimensional space in which two multivariate datasets are highly correlated. However, CCA has two limitations. First, it cannot readily exploit the label information attached to the data during dimensionality reduction. Second, it is defined for two datasets and cannot be applied directly to three or more. Discriminative canonical correlation analysis (DCCA) addresses the first limitation: it reduces the dimensionality of two datasets while reflecting the label information. Generalized canonical correlation analysis (GCCA) addresses the second: it computes canonical correlation variables, which are the products of parameters and data, for three or more datasets, and assumes that all datasets aggregate to a single piece of shared information, so that the parameters aggregate the information in each dataset. However, neither DCCA nor GCCA solves both problems at once. Therefore, in this study, we extend DCCA and propose a dimensionality reduction method that uses label information for three or more datasets. We validate the usefulness of the proposed method through a simulation study.
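
As a rough illustration of the two baseline methods described in the abstract, the following NumPy sketch implements classical two-view CCA (via whitening and SVD) and a MAXVAR-style GCCA for three or more views. It is not the method proposed in the paper, and the label-aware extension of DCCA is not reproduced here. The function names `cca` and `gcca_maxvar`, the small ridge term `reg`, the synthetic data, and the choice of the MAXVAR formulation of GCCA are all assumptions made for illustration.

```python
import numpy as np


def cca(X, Y, n_components=1, reg=1e-8):
    """Two-view CCA sketch. X is (n, p), Y is (n, q), rows are paired samples.

    Returns projection matrices A (p, k), B (q, k) and the canonical
    correlations. `reg` is an illustrative ridge term for numerical stability.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    A = Kx @ U[:, :n_components]
    B = Ky @ Vt.T[:, :n_components]
    return A, B, s[:n_components]


def gcca_maxvar(views, n_components=1, reg=1e-8):
    """MAXVAR-style GCCA sketch for a list of (n, p_j) views.

    Finds a shared representation G (n, k) as the top eigenvectors of the sum
    of (regularized) projections onto each view's column space, then maps each
    view to G by least squares.
    """
    n = views[0].shape[0]
    centered = [V - V.mean(axis=0) for V in views]
    M = np.zeros((n, n))
    for Xc in centered:
        gram = Xc.T @ Xc + reg * np.eye(Xc.shape[1])
        M += Xc @ np.linalg.solve(gram, Xc.T)
    w, V = np.linalg.eigh(M)          # eigenvalues in ascending order
    G = V[:, ::-1][:, :n_components]  # keep the top eigenvectors
    W = [
        np.linalg.solve(Xc.T @ Xc + reg * np.eye(Xc.shape[1]), Xc.T @ G)
        for Xc in centered
    ]
    return G, W


# Synthetic example: three noisy views of one shared latent signal z.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X1 = z @ rng.normal(size=(1, 4)) + 0.1 * rng.normal(size=(500, 4))
X2 = z @ rng.normal(size=(1, 3)) + 0.1 * rng.normal(size=(500, 3))
X3 = z @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(500, 5))

A, B, corr = cca(X1, X2)                # two views only
G, W = gcca_maxvar([X1, X2, X3])        # three views at once
shared_corr = abs(np.corrcoef(G[:, 0], z[:, 0])[0, 1])
print("first canonical correlation:", corr[0])
print("|corr(G, z)|:", shared_corr)
```

In this sketch, `cca` handles exactly two views, which reflects the second limitation noted in the abstract, while `gcca_maxvar` accepts any number of views but ignores labels, which reflects the first.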
