BMC Bioinformatics

Detecting PCOS susceptibility loci from genome-wide association studies via iterative trend correlation based feature screening

Abstract

Ultrahigh-dimensional data with a binary response and categorical features have become increasingly prevalent in various fields. Applications using such data arise in genome-wide association studies (GWAS), medical imaging, finance, and text mining, among others. The most prevalent gene selection approaches in GWAS test each genetic variant with a univariate model (i.e., a single-SNP model); because they evaluate each SNP in isolation from the others, they ignore the joint effects of multiple loci. In fact, most complex diseases are reported to be mediated through multiple genetic variants, each conferring a small or moderate effect with low penetrance, which obscures the individual significance of each variant. Furthermore, 70 to 80% of the genome shows regions of high linkage disequilibrium (LD), the nonrandom association of alleles at nearby loci. Malo et al. (2008) claimed that single-SNP approaches fail to differentiate truly influential SNPs from spurious SNPs that are merely in LD with the influential ones. Therefore, although widely used in GWAS data analysis for their simplicity, single-SNP models have limited power and yield high rates of both false positives and false negatives.
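To make the single-SNP screening described above concrete, here is a minimal Python sketch that scores each SNP in isolation. It is not the authors' method. It assumes genotypes coded 0/1/2 and a 0/1 case status, and it uses N times the squared Pearson correlation between the two, the usual form of the Cochran-Armitage trend statistic under additive coding. The function names and the toy data are hypothetical.

```python
def trend_statistic(genotypes, status):
    """Return N * r^2, where r is the Pearson correlation between
    genotype codes (0/1/2) and case status (0/1)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(status) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, status))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_y = sum((y - mean_y) ** 2 for y in status)
    if var_g == 0 or var_y == 0:
        return 0.0  # monomorphic SNP or single-class response: no signal
    return n * cov * cov / (var_g * var_y)


def single_snp_screen(snp_columns, status):
    """Score every SNP on its own and rank indices, largest statistic first."""
    scores = [trend_statistic(col, status) for col in snp_columns]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): 4 controls followed by 4 cases.
status = [0, 0, 0, 0, 1, 1, 1, 1]
causal = [0, 0, 1, 0, 1, 2, 2, 1]
proxy = [0, 0, 1, 0, 1, 2, 1, 1]   # differs from `causal` in one individual (high LD)
null = [1, 0, 2, 1, 1, 0, 2, 1]    # same genotype mix in cases and controls

order, scores = single_snp_screen([causal, proxy, null], status)
print(order, [round(s, 2) for s in scores])
```

Because each column is scored independently, a marker that merely tracks the influential SNP through LD receives a statistic of similar size, which is the limitation the paragraph attributes to single-SNP models.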