BMC Evolutionary Biology

Topology testing of phylogenies using least squares methods

Abstract

Background: The least squares (LS) method for constructing confidence sets of trees is closely related to LS tree-building methods, in which the criterion for selecting the best topology is the goodness of fit of the distances measured on the tree (patristic distances) to the observed distances between taxa. The generalized LS (GLS) method for topology testing is often hampered by the computational difficulty of calculating the covariance matrix and its inverse, which in practice requires approximations. The weighted LS (WLS) method allows a more efficient, albeit approximate, calculation of the test statistic by ignoring the covariances between the distances.

Results: The goal of this paper is to assess the applicability of the LS approach for constructing confidence sets of trees. We show that the approximations inherent in the WLS method did not negatively affect the accuracy or reliability of the test, in the analysis of both biological sequences and DNA-DNA hybridization data (for which character-based testing methods cannot be used). On the other hand, we report several problems with the GLS method, at least for the available implementation. For many data sets of biological sequences, the GLS statistic could not be calculated. For some data sets for which it could be calculated, the GLS method included all possible trees in the confidence set despite a strong phylogenetic signal in the data. Finally, in contrast to WLS, GLS showed undercoverage for simulated sequences (frequent non-inclusion of the true tree in the confidence set).

Conclusion: The WLS method provides a computationally efficient approximation to GLS that is especially useful in exploratory analyses of confidence sets of trees, when assessing the phylogenetic signal in the data, and when other methods are not available.
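To make the contrast between the two statistics concrete, the following is a minimal sketch using the standard textbook forms of the WLS and GLS quadratic forms. It is not the implementation evaluated in the paper: the function names (`wls_statistic`, `gls_statistic`, `solve`) and all numbers are hypothetical and chosen only for illustration. The sketch assumes the pairwise distances are flattened into a vector, that fitted (patristic) distances for a candidate topology are already available, and that the variances and covariances of the observed distances are given.

```python
from typing import List, Sequence


def wls_statistic(observed: Sequence[float],
                  fitted: Sequence[float],
                  variances: Sequence[float]) -> float:
    """Weighted LS: sum of squared residuals, each divided by its variance.

    Covariances between distances are ignored, so no matrix inverse is needed.
    """
    return sum((o - f) ** 2 / v for o, f, v in zip(observed, fitted, variances))


def solve(matrix: Sequence[Sequence[float]], rhs: Sequence[float]) -> List[float]:
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def gls_statistic(observed: Sequence[float],
                  fitted: Sequence[float],
                  covariance: Sequence[Sequence[float]]) -> float:
    """Generalized LS: r^T V^{-1} r, with r the residuals and V the covariance.

    V^{-1} r is obtained by solving V z = r instead of inverting V explicitly.
    """
    resid = [o - f for o, f in zip(observed, fitted)]
    z = solve(covariance, resid)
    return sum(r * zi for r, zi in zip(resid, z))


# Toy example (hypothetical numbers): two pairwise distances.
observed = [1.0, 2.0]
fitted = [0.0, 0.0]
diag_cov = [[1.0, 0.0], [0.0, 4.0]]      # no covariance between distances
full_cov = [[2.0, 1.0], [1.0, 2.0]]      # correlated distances

wls_value = wls_statistic(observed, fitted, [1.0, 4.0])
gls_diag_value = gls_statistic(observed, fitted, diag_cov)
gls_full_value = gls_statistic([1.0, 1.0], fitted, full_cov)
```

When the covariance matrix is diagonal, the two statistics coincide; they differ only once off-diagonal covariances are taken into account, which is the part WLS leaves out.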