Examining the effect of linkage disequilibrium between markers on the Type I error rate and power of nonparametric multipoint linkage analysis of two-generation and multigenerational pedigrees in the presence of missing genotype data

Abstract

Since most multipoint linkage analysis programs currently assume linkage equilibrium (LE) between markers when inferring parental haplotypes, ignoring linkage disequilibrium (LD) may inflate the Type I error rate. We investigated the effect of LD on the Type I error rate and power of nonparametric multipoint linkage analysis of two-generation and multigenerational multiplex families. Using genome-wide single nucleotide polymorphism (SNP) data from the Collaborative Study of the Genetics of Alcoholism (COGA), we modified the original data set into a total of 30 data sets in order to consider 6 different patterns of missing data at each of 5 different levels of SNP density. To assess power, we designed simulated traits based on existing marker genotypes. To assess the Type I error rate, we simulated 1,000 qualitative traits from random distributions, unlinked to any of the marker data. Overall, the different levels of SNP density examined here had only small effects on power (except for sibpair data). Missing data had a substantial effect on power, with more completely genotyped pedigrees yielding the highest power (except for sibpair data). Most of the missing data patterns did not cause large increases in the Type I error rate if the SNP markers were more than 0.3 cM apart. However, in a dense 0.25 cM map, removing genotypes on founders, or on founders together with parents in the middle generation, caused substantial inflation of the Type I error rate, and the inflation grew with the proportion of persons with missing data. Results also showed that long high-LD blocks have severe effects on Type I error rates.
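The Type I error assessment described above (many null traits, unlinked to the markers, each analyzed at a nominal significance level) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `empirical_type1_error`, the 0.05 threshold, the normal-approximation interval, and the uniform placeholder p-values are all assumptions made for illustration. In a real study the p-values would come from running the linkage program on each simulated null trait.

```python
import math
import random


def empirical_type1_error(p_values, alpha=0.05):
    """Return the fraction of null replicates with p <= alpha and a
    95% normal-approximation confidence interval for that fraction.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one null replicate")
    rejections = sum(1 for p in p_values if p <= alpha)
    rate = rejections / n
    # Binomial standard error of the estimated rejection rate.
    se = math.sqrt(rate * (1.0 - rate) / n)
    lower = max(0.0, rate - 1.96 * se)
    upper = min(1.0, rate + 1.96 * se)
    return rate, (lower, upper)


if __name__ == "__main__":
    # Placeholder data (assumption): uniform p-values stand in for a
    # well-calibrated test applied to 1,000 unlinked simulated traits.
    rng = random.Random(0)
    null_p_values = [rng.random() for _ in range(1000)]
    rate, (lower, upper) = empirical_type1_error(null_p_values, alpha=0.05)
    print(f"empirical Type I error: {rate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

If the interval sits clearly above the nominal level, the test is inflated, which is the pattern the abstract reports for the dense 0.25 cM map with missing founder genotypes.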