Source: Frontiers in Genetics

Random Projection for Fast and Efficient Multivariate Correlation Analysis of High-Dimensional Data: A New Approach



Abstract

In recent years, technological advances have produced a wealth of very high-dimensional data, and combining high-dimensional information from multiple sources is becoming increasingly important in a growing range of scientific disciplines. Partial Least Squares Correlation (PLSC) is a frequently used method for multivariate multimodal data integration. It is, however, computationally expensive in applications involving large numbers of variables, as required, for example, in genetic neuroimaging. To handle high-dimensional problems, dimension reduction can be implemented as a pre-processing step. We propose a new approach that incorporates Random Projection (RP) for dimensionality reduction into PLSC to efficiently solve high-dimensional multimodal problems such as genotype-phenotype associations. We name our new method PLSC-RP. Using simulated and experimental data sets containing whole-genome SNP measures as genotypes and whole-brain neuroimaging measures as phenotypes, we demonstrate that PLSC-RP is drastically faster than traditional PLSC while providing statistically equivalent results. We also provide evidence that dimensionality reduction using RP is independent of data type. PLSC-RP therefore opens up a wide range of possible applications: it can be used for any integrative analysis that combines information from multiple sources.
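The general idea described in the abstract, reducing the high-dimensional block with a random projection before running PLSC, can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian projection matrix, the reduced dimension `k`, and the function names `center`, `plsc`, and `plsc_rp` are all assumptions chosen for illustration, and PLSC is taken here in its common form as a singular value decomposition of the cross-product matrix of the two centered data blocks.

```python
import numpy as np

def center(a):
    """Subtract the column means so each variable has zero mean."""
    return a - a.mean(axis=0, keepdims=True)

def plsc(x, y):
    """Plain PLSC sketch: thin SVD of the cross-product matrix X^T Y."""
    u, s, vt = np.linalg.svd(center(x).T @ center(y), full_matrices=False)
    return u, s, vt.T

def plsc_rp(x, y, k, seed=0):
    """PLSC after a random projection of X (hypothetical helper).

    x: (n, p) high-dimensional block, e.g. genotypes
    y: (n, q) second block, e.g. imaging phenotypes
    k: reduced dimension for x (assumed parameter, k << p)
    """
    rng = np.random.default_rng(seed)
    # Gaussian random projection matrix (an assumed choice of RP scheme).
    r = rng.standard_normal((x.shape[1], k)) / np.sqrt(k)
    x_red = center(x) @ r                 # (n, k) instead of (n, p)
    u_red, s, v = plsc(x_red, y)          # SVD of a (k, q) matrix only
    # Latent scores for each block and the singular values.
    return x_red @ u_red, center(y) @ v, s

# Small simulated example: one shared latent signal drives both blocks.
rng = np.random.default_rng(1)
n, p, q = 50, 2000, 30
latent = rng.standard_normal((n, 1))
x = latent @ rng.standard_normal((1, p)) + rng.standard_normal((n, p))
y = latent @ rng.standard_normal((1, q)) + rng.standard_normal((n, q))
lx, ly, s = plsc_rp(x, y, k=40)
```

In this sketch the decomposition is computed on a `k × q` matrix rather than a `p × q` one, which is where a speed-up would come from when `p` is very large. Whether the results match full PLSC closely enough depends on `k` and on the data, so any such claim should be checked against the paper's own evaluation.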
