Journal: Systematic Biology

Assessing Parameter Identifiability in Phylogenetic Models Using Data Cloning


Abstract

The success of model-based methods in phylogenetics has motivated much research aimed at generating new, biologically informative models. This new computer-intensive approach to phylogenetics demands validation studies and sound measures of performance. To date there has been little practical guidance available as to when and why the parameters in a particular model can be identified reliably. Here, we illustrate how Data Cloning (DC), a recently developed methodology to compute the maximum likelihood estimates along with their asymptotic variance, can be used to diagnose structural parameter nonidentifiability (NI) and distinguish it from other parameter estimability problems, including when parameters are structurally identifiable, but are not estimable in a given data set (INE), and when parameters are identifiable, and estimable, but only weakly so (WE). The application of the DC theorem uses well-known and widely used Bayesian computational techniques. With the DC approach, practitioners can use Bayesian phylogenetics software to diagnose nonidentifiability. Theoreticians and practitioners alike now have a powerful, yet simple tool to detect nonidentifiability while investigating complex modeling scenarios, where getting closed-form expressions in a probabilistic study is complicated. Furthermore, here we also show how DC can be used as a tool to examine and eliminate the influence of the priors, in particular if the process of prior elicitation is not straightforward. Finally, when applied to phylogenetic inference, DC can be used to study at least two important statistical questions: assessing identifiability of discrete parameters, like the tree topology, and developing efficient sampling methods for computationally expensive posterior densities.
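The diagnostic described in the abstract can be illustrated with a minimal sketch on a toy Gaussian model. This is an assumption chosen for illustration and is not one of the phylogenetic models studied in the paper. Cloning the data K times raises the likelihood to the power K. For an identifiable parameter the posterior variance then shrinks roughly like 1/K, whereas for a structurally nonidentifiable parameter the largest eigenvalue of the posterior covariance does not shrink to zero. The function names `cloned_posterior_cov` and `largest_eigenvalue` are hypothetical, and the posterior is computed in closed form (conjugate normal prior) rather than by MCMC.

```python
import numpy as np


def cloned_posterior_cov(design, n_obs, k_clones, prior_var=1.0, noise_var=1.0):
    """Posterior covariance of theta under K-fold data cloning.

    Toy model (an assumption for illustration): y_i ~ N(design . theta, noise_var)
    for i = 1..n_obs, with independent N(0, prior_var) priors on each component
    of theta. For this conjugate model the posterior covariance does not depend
    on the observed values, so no data are needed.
    """
    design = np.asarray(design, dtype=float)
    p = design.size
    prior_precision = np.eye(p) / prior_var
    # Cloning the n_obs observations K times multiplies the likelihood
    # precision by K.
    like_precision = k_clones * n_obs * np.outer(design, design) / noise_var
    return np.linalg.inv(prior_precision + like_precision)


def largest_eigenvalue(cov):
    """Largest eigenvalue of a symmetric covariance matrix."""
    return float(np.linalg.eigvalsh(cov)[-1])


if __name__ == "__main__":
    n_obs = 20
    identifiable = [1.0]          # mean is mu: one parameter, identifiable
    nonidentifiable = [1.0, 1.0]  # mean is a + b: only the sum is identifiable
    print("K    identifiable    nonidentifiable")
    for k in (1, 10, 100, 1000):
        lam_id = largest_eigenvalue(cloned_posterior_cov(identifiable, n_obs, k))
        lam_ni = largest_eigenvalue(cloned_posterior_cov(nonidentifiable, n_obs, k))
        print(f"{k:<5d}{lam_id:<16.6f}{lam_ni:.6f}")
```

In this sketch the first column falls toward zero as K grows, while the second stays at the prior variance, because the data carry no information about the direction a - b. In practice the same comparison would be made from MCMC output of a Bayesian program run on the cloned data set.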
