Journal: American Journal of Human Genetics

An E-M algorithm and testing strategy for multiple-locus haplotypes.



Abstract

This paper gives an expectation-maximization (EM) algorithm to obtain allele frequencies, haplotype frequencies, and gametic disequilibrium coefficients for multiple-locus systems. It permits high polymorphism and null alleles at all loci. This approach effectively deals with the primary estimation problems associated with such systems: there is no one-to-one correspondence between phenotypic and genotypic categories, and sample sizes tend to be much smaller than the number of phenotypic categories. The EM method provides maximum-likelihood estimates and therefore allows hypothesis tests using likelihood-ratio statistics, which have χ² distributions in large samples. We also suggest a data resampling approach to estimate the sampling distributions of the test statistics. The resampling approach is more computer intensive, but it is applicable to all sample sizes. A strategy to test hypotheses about aggregate groups of gametic disequilibrium coefficients is recommended. This strategy minimizes the number of necessary hypothesis tests while at the same time describing the structure of disequilibrium. These methods are applied to three unlinked dinucleotide repeat loci in Navajo Indians and to three linked HLA loci in Gila River (Pima) Indians. The likelihood functions of both data sets are shown to be maximized by the EM estimates, and the testing strategy provides a useful description of the structure of gametic disequilibrium. Following these applications, a number of simulation experiments are performed to test how well the likelihood-ratio statistic distributions are approximated by χ² distributions. In most circumstances the χ² approximation grossly underestimated the probability of type I errors; at times, however, it overestimated that probability. Accordingly, we recommend hypothesis tests that use the resampling method.
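
To make the ideas in the abstract concrete, here is a minimal Python sketch of EM haplotype-frequency estimation, a likelihood-ratio statistic for gametic disequilibrium, and a resampling p-value. It is an illustration under strong simplifying assumptions, not the paper's implementation: it handles only two biallelic, codominant loci with no null alleles, whereas the paper covers multiple loci, many alleles, and null alleles. The function names, the genotype encoding as (copies of A, copies of B), and the permutation scheme (shuffling locus-B genotypes across individuals) are all assumptions made for this example; the abstract does not specify the resampling scheme used.

```python
import math
import random

# Haplotype counts (AB, Ab, aB, ab) fixed by each phase-known genotype,
# keyed by (copies of allele A, copies of allele B). The double
# heterozygote (1, 1) is phase-ambiguous and is handled by the E-step.
KNOWN = {
    (2, 2): (2, 0, 0, 0), (2, 1): (1, 1, 0, 0), (2, 0): (0, 2, 0, 0),
    (1, 2): (1, 0, 1, 0), (1, 0): (0, 1, 0, 1),
    (0, 2): (0, 0, 2, 0), (0, 1): (0, 0, 1, 1), (0, 0): (0, 0, 0, 2),
}

def em_haplotypes(counts, tol=1e-10, max_iter=1000):
    """Estimate (pAB, pAb, paB, pab) from genotype counts by EM."""
    n = sum(counts.values())
    base = [0.0] * 4
    for g, c in counts.items():
        if g != (1, 1):
            for k in range(4):
                base[k] += c * KNOWN[g][k]
    n_dh = counts.get((1, 1), 0)
    p = [0.25] * 4
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is AB/ab (cis).
        cis, trans = p[0] * p[3], p[1] * p[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by 2n chromosomes.
        exp = [base[0] + n_dh * w, base[1] + n_dh * (1 - w),
               base[2] + n_dh * (1 - w), base[3] + n_dh * w]
        new_p = [e / (2 * n) for e in exp]
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p

def log_lik(counts, p):
    """Log-likelihood (up to a constant), assuming random union of haplotypes."""
    pAB, pAb, paB, pab = p
    prob = {
        (2, 2): pAB ** 2, (2, 1): 2 * pAB * pAb, (2, 0): pAb ** 2,
        (1, 2): 2 * pAB * paB, (1, 1): 2 * (pAB * pab + pAb * paB),
        (1, 0): 2 * pAb * pab,
        (0, 2): paB ** 2, (0, 1): 2 * paB * pab, (0, 0): pab ** 2,
    }
    return sum(c * math.log(prob[g]) for g, c in counts.items() if c > 0)

def lr_statistic(counts):
    """2 * (logL at EM estimates - logL under no gametic disequilibrium)."""
    n2 = 2 * sum(counts.values())
    p = em_haplotypes(counts)
    pA = sum(g[0] * c for g, c in counts.items()) / n2
    pB = sum(g[1] * c for g, c in counts.items()) / n2
    p0 = [pA * pB, pA * (1 - pB), (1 - pA) * pB, (1 - pA) * (1 - pB)]
    return 2.0 * (log_lik(counts, p) - log_lik(counts, p0))

def permutation_p_value(counts, n_perm=200, seed=0):
    """Resampling p-value: shuffle locus-B genotypes across individuals."""
    rng = random.Random(seed)
    a = [g[0] for g, c in counts.items() for _ in range(c)]
    b = [g[1] for g, c in counts.items() for _ in range(c)]
    observed = lr_statistic(counts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(b)
        perm = {}
        for g in zip(a, b):
            perm[g] = perm.get(g, 0) + 1
        if lr_statistic(perm) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 10 AABB, 20 AaBb, and 10 aabb individuals.
counts = {(2, 2): 10, (1, 1): 20, (0, 0): 10}
freqs = em_haplotypes(counts)
print("haplotype frequencies:", freqs)
print("D =", freqs[0] * freqs[3] - freqs[1] * freqs[2])
print("LR statistic:", lr_statistic(counts))
print("permutation p-value:", permutation_p_value(counts))
```

The permutation p-value does not rely on the χ² approximation, which is the motivation the abstract gives for preferring resampling; the cost is one EM fit per resampled data set.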
