Association Test Based on SNP Set: Logistic Kernel Machine Based Test vs. Principal Component Analysis

Abstract

Genome-wide association studies (GWAS) have greatly facilitated the discovery of risk SNPs associated with complex diseases. Traditional methods analyze SNPs individually and are limited by low power and poor reproducibility because correction for multiple comparisons is necessary. Several methods have been proposed that group SNPs into SNP sets using biological knowledge and/or genomic features. In this article, we compare the logistic kernel machine based test (LKM) and the principal component analysis based approach (PCA) using simulated datasets with 0 to 3 causal SNPs and with simple and complex linkage disequilibrium (LD) structures in the simulated regions. Our simulation study demonstrates that both LKM and PCA control the type I error at the significance level of 0.05. If the causal SNP is in strong LD with the genotyped SNPs, both PCA with a small number of principal components (PCs) and LKM with a linear or identity-by-state (IBS) kernel are valid tests. However, if the LD structure is complex, for example when the SNP set contains several LD blocks, or when the causal SNP is not in the LD block in which most of the genotyped SNPs reside, more PCs should be included to capture the information of the causal SNP. The simulation studies also demonstrate that LKM and PCA can combine information from multiple causal SNPs and provide increased power over individual SNP analysis. We also apply LKM and PCA to two SNP sets extracted from an actual GWAS dataset on non-small cell lung cancer.
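The two SNP-set approaches compared above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give their test procedure, so this sketch uses an intercept-only null model, score-type statistics, and permutation p-values as stand-ins. All function names, the sample size, the allele frequencies, and the effect size are hypothetical, and the simulated SNPs are independent (no LD), which is simpler than the scenarios the paper studies.

```python
import numpy as np

def linear_kernel(G):
    """Linear kernel on column-centered genotypes: K = Gc Gc^T."""
    Gc = G - G.mean(axis=0)
    return Gc @ Gc.T

def ibs_kernel(G):
    """Identity-by-state kernel for 0/1/2 genotypes: shared alleles / (2 * #SNPs)."""
    p = G.shape[1]
    diff = np.abs(G[:, None, :] - G[None, :, :])  # 0, 1 or 2 per SNP pair
    return (2.0 - diff).sum(axis=2) / (2.0 * p)

def kernel_score_stat(y, K):
    """Score-type statistic Q = r^T K r / 2, with r = y - mean(y) (intercept-only null)."""
    r = y - y.mean()
    return float(r @ K @ r) / 2.0

def top_pcs(G, k):
    """Scores of the first k principal components of the centered genotype matrix."""
    Gc = G - G.mean(axis=0)
    U, s, _ = np.linalg.svd(Gc, full_matrices=False)
    return U[:, :k] * s[:k]

def pca_score_stat(y, Z):
    """Logistic-regression score statistic for adding the PCs Z to an intercept-only model."""
    ybar = y.mean()
    u = Z.T @ (y - ybar)
    return float(u @ np.linalg.solve(Z.T @ Z, u)) / (ybar * (1.0 - ybar))

def permutation_pvalue(y, stat_fn, n_perm=999, seed=0):
    """Permutation p-value: share of label shuffles with a statistic >= the observed one."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(y)
    hits = sum(stat_fn(rng.permutation(y)) >= observed for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 200 subjects, 10 independent SNPs, SNP 0 causal.
rng = np.random.default_rng(42)
n, p = 200, 10
maf = rng.uniform(0.1, 0.5, size=p)            # assumed minor allele frequencies
G = rng.binomial(2, maf, size=(n, p))          # genotypes coded 0/1/2
logit = -0.5 + 0.6 * G[:, 0]                   # assumed effect size for the causal SNP
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit))).astype(float)

K_lin = linear_kernel(G)
K_ibs = ibs_kernel(G)
Z = top_pcs(G, k=3)

p_lkm = permutation_pvalue(y, lambda v: kernel_score_stat(v, K_lin))
p_ibs = permutation_pvalue(y, lambda v: kernel_score_stat(v, K_ibs))
p_pca = permutation_pvalue(y, lambda v: pca_score_stat(v, Z))
print(f"linear kernel p={p_lkm:.3f}, IBS kernel p={p_ibs:.3f}, PCA (3 PCs) p={p_pca:.3f}")
```

Changing `k` in `top_pcs` is the knob that corresponds to the abstract's remark that more PCs may be needed when the LD structure is complex. Permutation is used here only to keep the sketch free of distributional approximations.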
