Unraveling complex relationships between heterogeneous omics datasets using local principal components


Abstract

There is growing interest in studying the dependencies between multiple data sources. A common way to analyze the relationship between a pair of data sources based on their correlation is canonical correlation analysis (CCA), which seeks linear combinations of all variables from each dataset that maximize the correlation between them. However, in high-dimensional datasets such as genomic data, where the number of variables exceeds the number of experimental units, CCA may not yield meaningful information. Moreover, when collinearity exists in one or both datasets, CCA may not be applicable. In this paper, we present a novel method to extract common features from a pair of data sources using local principal components and Kendall's ranking. The results show that the proposed method outperforms CCA in many scenarios and is more robust to noisy data. Moreover, the proposed method produces meaningful results when the number of variables exceeds the number of observed units.
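The high-dimensional limitation of CCA mentioned in the abstract can be illustrated with a minimal NumPy sketch. It covers classical CCA only, not the authors' proposed method, and it assumes the standard formulation in which canonical correlations are the cosines of the principal angles between the column spaces of the centered data matrices. The function names `orthonormal_basis` and `canonical_correlations`, the tolerance, and the synthetic data sizes are all illustrative assumptions.

```python
import numpy as np


def orthonormal_basis(A, tol=1e-10):
    """Orthonormal basis of the column space of A (near-null directions dropped)."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    if s.size == 0 or s[0] == 0.0:
        return U[:, :0]
    rank = int(np.sum(s > tol * s[0]))  # numerical rank, handles collinearity
    return U[:, :rank]


def canonical_correlations(X, Y):
    """Canonical correlations between X (n x p) and Y (n x q), rows = units."""
    Xc = X - X.mean(axis=0)  # center each variable
    Yc = Y - Y.mean(axis=0)
    Qx = orthonormal_basis(Xc)
    Qy = orthonormal_basis(Yc)
    # Singular values of Qx^T Qy are the cosines of the principal angles.
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)


rng = np.random.default_rng(0)
n = 20  # number of experimental units (assumed for illustration)

# Fewer variables than units: independent noise gives correlations below 1.
rho_low = canonical_correlations(rng.normal(size=(n, 5)),
                                 rng.normal(size=(n, 4)))

# More variables than units: both centered matrices span the same
# (n - 1)-dimensional space, so every canonical correlation equals 1
# even though the two datasets are independent noise.
rho_high = canonical_correlations(rng.normal(size=(n, 50)),
                                  rng.normal(size=(n, 40)))

print("p < n, largest canonical correlation:", rho_low[0])
print("p > n, smallest canonical correlation:", rho_high.min())
```

In the second case the perfect correlations carry no information about any real relationship between the two datasets, which is the kind of degenerate outcome that motivates alternatives to plain CCA when variables outnumber units.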


Bibliographic record

Source: 2011 IEEE International Conference on Information Reuse and Integration, 2011, pp. 136-141 (6 pages)
Authors: Alaydie Noor; Fotouhi Farshad
Original format: PDF
Chinese Library Classification: automatic information theory

