Robust statistical methods for estimating the distance and the rate of change among molecular sequences sampled from rapidly-evolving organisms.
Quantifying the changes observed between molecular sequences and measuring the rate of change are active areas of statistical methods development, important to systematics, virology and evolutionary biology. In this dissertation, I focus on two inference methods that complement each other in achieving these ends. The first, Tree and Rate Estimation by Local Evaluation (TREBLE), is a general method for estimating the rate of molecular sequence substitution over a set of time-sampled taxa using only a matrix of pairwise distances and a vector of sampling times. TREBLE proves both computationally efficient and effective for estimating the rate of substitution. The second method introduces robust counting for labeled distances. Robust counting provides an estimation procedure with strong protection against bias arising from misspecification of the underlying model, a rampant problem in biological sequence analysis. The approach also offers a general understanding of labeled distances, where the continuous-time Markov chain (CTMC) transitions of interest form only a subset of all possible state changes. Finally, the method introduces a general approach for generating CTMC codon models from simple nucleotide models, avoiding the pitfall of introducing additional, computationally expensive parameter estimates to infer sequence substitutions in codon space. Used in combination, TREBLE and robust counting permit researchers to estimate distances based on substitution types motivated by their biological inquiries and then immediately estimate the rate at which these processes occur, with statistical confidence in both quantities.
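The notion of a labeled distance can be made concrete with a small sketch. The code below is illustrative only and is not the dissertation's robust counting estimator: it assumes a Kimura two-parameter (K80) nucleotide model, a uniform stationary distribution, and a labeling that marks transitions (A↔G, C↔T) as the changes of interest. All function names are hypothetical. For a stationary CTMC with rates q_ij and stationary distribution π, the expected number of labeled changes per unit time is the sum of π_i q_ij over the labeled pairs (i, j), and the labeled distance over time t is that rate multiplied by t.

```python
# Illustrative sketch: the K80 model and the transition labeling are
# assumptions chosen for the example, not taken from the dissertation.
STATES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k80_rate_matrix(kappa):
    """Return unnormalized K80 off-diagonal rates keyed by (from, to)."""
    rates = {}
    for a in STATES:
        for b in STATES:
            if a != b:
                # Transitions occur at rate kappa, transversions at rate 1.
                rates[(a, b)] = kappa if (a, b) in TRANSITIONS else 1.0
    return rates


def labeled_rate(rates, pi, labeled):
    """Expected number of labeled changes per unit time at stationarity."""
    return sum(pi[a] * r for (a, b), r in rates.items() if (a, b) in labeled)


def labeled_distance(rates, pi, labeled, t):
    """Expected number of labeled changes accumulated over time t."""
    return labeled_rate(rates, pi, labeled) * t


if __name__ == "__main__":
    pi = {s: 0.25 for s in STATES}          # uniform stationary distribution
    rates = k80_rate_matrix(kappa=2.0)
    all_changes = set(rates)                # labeling every change
    print(labeled_distance(rates, pi, TRANSITIONS, t=0.1))
    print(labeled_distance(rates, pi, all_changes, t=0.1))
```

Labeling every possible change recovers the usual total expected number of substitutions, so the labeled distance for any subset is bounded above by the ordinary distance under the same model.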