PLoS Computational Biology

A quadratically regularized functional canonical correlation analysis for identifying the global structure of pleiotropy with NGS data

Abstract

Investigating the pleiotropic effects of genetic variants can increase statistical power, deepen our understanding of the complex genetic structure of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current paradigm for multiple-phenotype association analysis lacks breadth (the number of phenotypes and genetic variants analyzed jointly) and depth (the hierarchical structure of phenotypes and genotypes). A key issue in high-dimensional pleiotropic analysis is how to extract informative internal representations and features from high-dimensional genotype and phenotype data. To exploit the correlation among genetic variants, reduce data dimensions effectively, and overcome critical barriers to developing new statistical methods and computational algorithms for genetic pleiotropic analysis, we propose a new statistical method for association analysis, referred to as quadratically regularized functional CCA (QRFCCA), which combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis, and (3) canonical correlation analysis (CCA). Large-scale simulations show that QRFCCA has much higher power than ten competing statistics while retaining appropriate type 1 error rates. To further evaluate performance, QRFCCA and the ten other statistics are applied to the whole-genome sequencing dataset from the TwinsUK study. Using QRFCCA, we identify a total of 79 genes with rare variants and 67 genes with common variants that are significantly associated with the 46 traits. The results show that QRFCCA substantially outperforms the ten other statistics.
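To make two of the three ingredients named above concrete, the following is a minimal sketch and not the authors' implementation. It assumes the quadratically regularized factorization takes the common form min ||X - AB||_F^2 + lam(||A||_F^2 + ||B||_F^2), which has a closed-form solution through a truncated SVD, and it then computes ordinary canonical correlations between the reduced genotype scores and the phenotypes. The functional data analysis step is omitted entirely. The function names, the penalty `lam`, the rank `k`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def quad_reg_factorize(X, k, lam):
    """Minimize ||X - A @ B||_F^2 + lam * (||A||_F^2 + ||B||_F^2) at rank k.

    Assumed objective (hypothetical choice for this sketch); its minimizer
    is a truncated SVD with singular values shrunk by lam and floored at 0.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    shrunk = np.sqrt(np.maximum(s[:k] - lam, 0.0))
    A = U[:, :k] * shrunk            # n x k sample scores
    B = shrunk[:, None] * Vt[:k]     # k x p variant loadings
    return A, B


def _orthonormal_basis(M, tol=1e-10):
    """Orthonormal basis for the column space of the column-centered M."""
    Mc = M - M.mean(axis=0)
    U, s, _ = np.linalg.svd(Mc, full_matrices=False)
    if s.size == 0 or s[0] == 0.0:
        return U[:, :0]
    return U[:, s > tol * s[0]]      # drop numerically null directions


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and Y, largest first."""
    Qx = _orthonormal_basis(X)
    Qy = _orthonormal_basis(Y)
    if Qx.shape[1] == 0 or Qy.shape[1] == 0:
        return np.zeros(0)
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)    # guard against rounding above 1


# Synthetic example: genotypes and phenotypes share three latent factors.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))
X = Z @ rng.normal(size=(3, 30)) + 0.1 * rng.normal(size=(200, 30))
Y = Z @ rng.normal(size=(3, 4)) + 0.1 * rng.normal(size=(200, 4))

Xc = X - X.mean(axis=0)
A, B = quad_reg_factorize(Xc, k=3, lam=1.0)
print(canonical_correlations(A, Y))
```

In this sketch the factorization compresses 30 correlated variant columns into 3 scores per sample before CCA is applied, which is the dimension-reduction role the abstract describes. How the paper builds a test statistic from the canonical correlations is not stated in the abstract and is not reproduced here.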
