Journal: Human Heredity

Using parametric multipoint lods and mods for linkage analysis requires a shift in statistical thinking



Abstract

Multipoint (MP) linkage analysis represents a valuable tool for whole-genome studies but suffers from the disadvantage that its probability distribution is unknown and varies as a function of marker information and density, genetic model, number and structure of pedigrees, and the affection status distribution [Xing and Elston: Genet Epidemiol 2006;30:447-458; Hodge et al.: Genet Epidemiol 2008;32:800-815]. This implies that the MP significance criterion can differ for each marker and each dataset, and this fact makes planning and evaluation of MP linkage studies difficult. One way to circumvent this difficulty is to use simulations or permutation testing. Another approach is to use an alternative statistical paradigm to assess the statistical evidence for linkage, one that does not require computation of a p value. Here we show how to use the evidential statistical paradigm for planning, conducting, and interpreting MP linkage studies when the disease model is known (lod analysis) or unknown (mod analysis). As a key feature, the evidential paradigm decouples uncertainty (i.e. error probabilities) from statistical evidence. In the planning stage, one calculates error probabilities as functions of one's design choices (sample size, choice of alternative hypothesis, choice of likelihood ratio (LR) criterion k) in order to ensure a reliable study design. In the data analysis stage, one no longer pays attention to those error probabilities. In this stage, one calculates the LR for two simple hypotheses (i.e. the trait locus is unlinked vs. the trait locus is located at a particular position) as a function of the parameter of interest (position). The LR directly measures the strength of evidence for linkage in a given data set and remains completely divorced from the error probabilities calculated in the planning stage. An important consequence of this procedure is that one can use the same criterion k for all analyses.
This contrasts with the situation described above, in which the value one uses to conclude significance may differ for each marker and each dataset in order to accommodate a fixed test size, α. In this study, we accomplish two goals that lead to a general algorithm for conducting evidential MP linkage studies. (1) We provide two theoretical results that translate into guidelines for investigators conducting evidential MP linkage: (a) Compared with lods, mods generally have higher error rates (including probabilities of weak evidence) when the null hypothesis is true, but lower error rates in the presence of true linkage. Royall [J Am Stat Assoc 2000;95:760-780] has shown that errors based on lods are bounded and generally small. Therefore, when the true disease model is unknown and one chooses to use mods, one needs to control misleading evidence rates only under the null hypothesis; (b) for any given pair of contiguous marker loci, error rates under the null are greatest at the midpoint between the markers spaced furthest apart, which provides an obvious simple alternative hypothesis to specify for planning MP linkage studies. (2) We demonstrate through extensive simulation that this evidential approach can yield low error rates under the null and alternative hypotheses for both lods and mods, despite the fact that mod scores are not true LRs. Using these results, we provide a coherent approach to implement an MP linkage study using the evidential paradigm.
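
The planning-stage calculation described in the abstract can be illustrated with a minimal sketch. It is not the authors' multipoint procedure: it assumes a much simpler two-point setting with n phase-known, fully informative meioses, where the number of recombinants is binomial and the two simple hypotheses are H0: θ = 0.5 (unlinked) and H1: θ = θ1. The function names (`binom_pmf`, `lod`, `evidential_error_rates`) and the example values of n, θ1, and k are hypothetical, chosen only for illustration. Under these assumptions, the probabilities of misleading, weak, and strong correct evidence can be computed exactly for a given LR criterion k.

```python
from math import comb, log10


def binom_pmf(r, n, p):
    """Probability of r recombinants among n meioses when P(recombinant) = p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def lod(r, n, theta1):
    """log10 likelihood ratio of H1 (theta = theta1) vs. H0 (theta = 0.5)."""
    return r * log10(theta1) + (n - r) * log10(1 - theta1) - n * log10(0.5)


def evidential_error_rates(n, theta1, k):
    """Exact planning-stage probabilities for a toy two-point design.

    Assumption (not from the source): phase-known, fully informative
    meioses, so the recombinant count is Binomial(n, theta).
    Evidence is 'strong' for H1 when LR >= k, 'strong' for H0 when
    LR <= 1/k, and 'weak' otherwise.
    """
    if not 0.0 < theta1 < 0.5:
        raise ValueError("theta1 must lie strictly between 0 and 0.5")
    log_k = log10(k)
    rates = {
        "H0": {"misleading": 0.0, "weak": 0.0, "strong_correct": 0.0},
        "H1": {"misleading": 0.0, "weak": 0.0, "strong_correct": 0.0},
    }
    for r in range(n + 1):
        z = lod(r, n, theta1)
        p0 = binom_pmf(r, n, 0.5)     # probability of this outcome if unlinked
        p1 = binom_pmf(r, n, theta1)  # probability of this outcome if linked
        if z >= log_k:       # strong evidence favouring linkage
            rates["H0"]["misleading"] += p0
            rates["H1"]["strong_correct"] += p1
        elif z <= -log_k:    # strong evidence favouring no linkage
            rates["H0"]["strong_correct"] += p0
            rates["H1"]["misleading"] += p1
        else:                # evidence too weak to favour either hypothesis
            rates["H0"]["weak"] += p0
            rates["H1"]["weak"] += p1
    return rates


if __name__ == "__main__":
    # Hypothetical design: 20 meioses, theta1 = 0.1, criterion k = 32.
    for hypothesis, probs in evidential_error_rates(20, 0.1, 32).items():
        print(hypothesis, {name: round(p, 4) for name, p in probs.items()})
```

For two simple hypotheses, the probability of misleading evidence under either hypothesis cannot exceed 1/k, so a design is tuned mainly by increasing n until the probability of weak evidence is acceptably small. The sketch covers only the lod (known model) case; mods maximize over model parameters and are not true LRs, so this bound is not guaranteed for them.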
