Corrected Log Det evolutionary distance estimation.

Abstract

In this thesis, we study the use of distance methods to reconstruct evolutionary trees, with a focus on LogDet distances. The LogDet estimator is a measure of divergence (evolutionary distance) between sequences of biological characters: DNA, amino acids, or gene content data. This transformation is useful because many existing distances tend to falsely group sequences on the basis of their similar nucleotide composition. A difficulty, however, is that the LogDet distance does not exist when the determinant is less than or equal to zero.

We introduce a corrected LogDet distance with a correction factor alpha. With appropriate values of alpha, we can substantially decrease the proportion of non-existent distances. There is a tradeoff between choosing alpha to minimize the probability of a non-positive distance and choosing it to minimize the mean squared error (MSE) of the distance. We find optimal alpha values that minimize the MSE of the distance estimator and analyze their performance in decreasing the probability of non-existence. We also briefly introduce methods that can be used to estimate the edge lengths and the topology. We use the estimated edge lengths calculated with LogDet and corrected LogDet distances to estimate trees in four-taxon simulations.

Examining the proportion of times the estimated topology was the same as the true topology, we found that the LogDet distance can be used to accurately reconstruct the true evolutionary tree in many situations. However, the corrected distance performed better, since it dealt with the problem of non-existence.
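The non-existence problem described in the abstract can be illustrated with a minimal Python sketch. It assumes one commonly used normalized form of the LogDet distance for an alphabet of r states, d = -(1/r) [ln det F - (1/2) Σ ln f_x(i) - (1/2) Σ ln f_y(i)], where F is the joint frequency matrix of the two aligned sequences and f_x, f_y are its row and column marginals. The abstract does not state which form the thesis uses, and it does not give the formula for the alpha correction, so the correction is not implemented here. The function names (`divergence_matrix`, `determinant`, `logdet_distance`) are hypothetical and chosen for illustration only.

```python
import math

ALPHABET = "ACGT"  # assumption: nucleotide data; other alphabets can be passed in


def divergence_matrix(seq_x, seq_y, alphabet=ALPHABET):
    """Joint frequency matrix F: F[i][j] is the proportion of sites with
    state i in seq_x and state j in seq_y."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {ch: k for k, ch in enumerate(alphabet)}
    r = len(alphabet)
    counts = [[0.0] * r for _ in range(r)]
    n = 0
    for a, b in zip(seq_x, seq_y):
        if a in index and b in index:  # skip gaps and ambiguous states
            counts[index[a]][index[b]] += 1.0
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return [[c / n for c in row] for row in counts]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda row: abs(m[row][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for row in range(col + 1, size):
            factor = m[row][col] / m[col][col]
            for k in range(col, size):
                m[row][k] -= factor * m[col][k]
    return det


def logdet_distance(seq_x, seq_y, alphabet=ALPHABET):
    """Normalized LogDet distance (assumed form, see lead-in).
    Returns None when the distance does not exist (det F <= 0)."""
    f = divergence_matrix(seq_x, seq_y, alphabet)
    r = len(alphabet)
    det = determinant(f)
    row_freq = [sum(row) for row in f]
    col_freq = [sum(f[i][j] for i in range(r)) for j in range(r)]
    if det <= 0.0 or min(row_freq) <= 0.0 or min(col_freq) <= 0.0:
        return None  # the logarithm is undefined: the non-existence case
    log_marginals = sum(math.log(p) for p in row_freq + col_freq)
    return -(math.log(det) - 0.5 * log_marginals) / r
```

For example, `logdet_distance("AAAA", "AAAA")` returns `None` because three of the four states never occur and the determinant is zero. Short or highly diverged alignments can produce the same outcome, which is the situation the corrected distance in the thesis is meant to address.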

Bibliographic record

  • Author

    Gao, He

  • Author affiliation

    Dalhousie University (Canada)

  • Degree-granting institution: Dalhousie University (Canada)
  • Subject: Applied Mathematics; Biology Bioinformatics; Biology Evolution and Development
  • Degree: M.Sc.
  • Year: 2009
  • Pages: 47 p.
  • Total pages: 47
  • Original format: PDF
  • Language: eng
  • Chinese Library Classification: African history
