Searching for Convergence in Phylogenetic Markov Chain Monte Carlo

Robert G. Beiko; Jonathan M. Keith; Timothy J. Harlow; and Mark A. Ragan

Abstract

Markov chain Monte Carlo (MCMC) is a methodology that is gaining widespread use in the phylogenetics community and is central to phylogenetic software packages such as MrBayes. An important issue for users of MCMC methods is how to select appropriate values for adjustable parameters such as the length of the Markov chain or chains, the sampling density, the proposal mechanism, and, if Metropolis-coupled MCMC is being used, the number of heated chains and their temperatures. Although some parameter settings have been examined in detail in the literature, others are frequently chosen with more regard to computational time or personal experience with other data sets. Such choices may lead to inadequate sampling of tree space or an inefficient use of computational resources. We performed a detailed study of convergence and mixing for 70 randomly selected, putatively orthologous protein sets with different sizes and taxonomic compositions. Replicated runs from multiple random starting points permit a more rigorous assessment of convergence, and we developed two novel statistics, δ and ε, for this purpose. Although likelihood values invariably stabilized quickly, adequate sampling of the posterior distribution of tree topologies took considerably longer. Our results suggest that multimodality is common for data sets with 30 or more taxa and that this results in slow convergence and mixing. However, we also found that the pragmatic approach of combining data from several short, replicated runs into a “metachain” to estimate bipartition posterior probabilities provided good approximations, and that such estimates were no worse in approximating a reference posterior distribution than those obtained using a single long run of the same length as the metachain. Precision appears to be best when heated Markov chains have low temperatures, whereas chains with high temperatures appear to sample trees with high posterior probabilities only rarely.
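The "metachain" idea in the abstract, pooling the samples of several short replicated runs to estimate bipartition posterior probabilities, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: each sampled tree is represented as a set of bipartitions (frozensets of taxon names), the function names and the `burn_in` parameter are hypothetical, and `max_abs_difference` is a generic comparison measure, not the δ or ε statistic of the paper, which the abstract does not define.

```python
from collections import Counter


def bipartition_frequencies(samples):
    """Fraction of sampled trees that contain each bipartition.

    `samples` is a list of trees; each tree is an iterable of hashable
    bipartitions (an assumed representation, e.g. frozensets of taxa).
    """
    n = len(samples)
    if n == 0:
        return {}
    counts = Counter()
    for tree in samples:
        counts.update(set(tree))  # count each bipartition once per tree
    return {split: c / n for split, c in counts.items()}


def metachain_frequencies(runs, burn_in=0):
    """Pool post-burn-in samples from replicated runs into one estimate."""
    pooled = [tree for run in runs for tree in run[burn_in:]]
    return bipartition_frequencies(pooled)


def max_abs_difference(p, q):
    """Largest absolute gap between two sets of bipartition frequencies.

    A bipartition missing from one estimate is treated as frequency 0.
    """
    splits = set(p) | set(q)
    return max((abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in splits), default=0.0)
```

A comparison of this kind, run between the pooled estimate and a reference estimate from a separate long run, mirrors the evaluation the abstract describes, though the authors' actual procedure may differ in detail.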

Bibliographic details

Source: Systematic Biology, 2006, no. 4, pp. 553-565 (13 pages)

Author affiliations:

ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia; E-mail: r.beiko{at}gmail.com (R.G.B.) and ARC Centre in Bioinformatics;

Department of Mathematics, The University of Queensland, Brisbane, Australia

Original format: PDF
Language: English
