Robust statistical methods for estimating the distance and the rate of change among molecular sequences sampled from rapidly-evolving organisms.

Abstract

Quantifying the changes observed between molecular sequences and measuring the rate of change are areas of vigorous interest for statistical methodological development, important to systematics, virology and evolutionary biology. In this dissertation, I focus on two inference methods that complement each other in achieving these ends. The first, Tree and Rate Estimation by Local Evaluation (TREBLE), is a general method for estimating the rate of molecular sequence substitution over a set of time-sampled taxa using only a matrix of pairwise distances and a vector of sampling times. TREBLE proves both computationally efficient and highly effective for estimating the rate of substitution. The second method introduces robust counting for labeled distances. Robust counting provides an estimation procedure with strong protection against bias arising from underlying model misspecification, a rampant problem in biological sequence analysis. The approach also proffers a general understanding of labeled distances, where the continuous-time Markov chain (CTMC) transitions of interest form only a subset of all possible state changes. Finally, the method introduces a general approach for generating CTMC codon models from simple nucleotide models, avoiding the pitfall of introducing additional, computationally expensive parameter estimates to infer sequence substitutions in codon space. Used in combination, TREBLE and robust counting permit researchers to estimate distances based on substitution types motivated by their biological inquiries and then immediately estimate the rate at which these processes occur, with statistical confidence in both quantities.

Bibliographic record

  • Author

    O'Brien, John David

  • Author affiliation

    University of California, Los Angeles

  • Degree-granting institution: University of California, Los Angeles
  • Subject: Bioinformatics
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 180
  • Original format: PDF
  • Language: English
