Methods for selecting fixed-effect models for heterogeneous codon evolution, with comments on their application to gene and genome data

Le Bao; Hong Gu; Katherine A Dunn; Joseph P Bielawski

Abstract

Background: Models of codon evolution have proven useful for investigating the strength and direction of natural selection. In some cases, a priori biological knowledge has been used successfully to model heterogeneous evolutionary dynamics among codon sites. Such models are called fixed-effect models. They require that every codon site be assigned to one of several partitions, and each partition is permitted to have independent parameters for selection pressure, evolutionary rate, transition-to-transversion ratio, or codon frequencies. For single-gene analysis, partitions might be defined according to protein tertiary structure; for multi-gene analysis, partitions might be defined according to each gene's functional category. Given a set of related fixed-effect models, the task of selecting the model that best fits the data is not trivial.

Results: In this study, we implement a set of fixed-effect codon models that allow for different levels of heterogeneity among partitions in the substitution process. We describe strategies for selecting among these models by a backward elimination procedure, the Akaike information criterion (AIC), or a corrected Akaike information criterion (AICc). We evaluate the performance of these model selection methods via a simulation study and make several recommendations for real data analysis. Our simulation study indicates that the backward elimination procedure can provide a reliable method for model selection in this setting. We also demonstrate the utility of these models by applying them to a single-gene dataset partitioned according to tertiary structure (abalone sperm lysin) and to a multi-gene dataset partitioned according to the functional category of the gene (flagellar-related proteins of Listeria).

Conclusion: Fixed-effect models have advantages and disadvantages. They are desirable when data partitions are known to exhibit significant heterogeneity or when a statistical test of such heterogeneity is desired. They have the disadvantage of requiring a priori knowledge for partitioning sites. We recommend (i) selecting models by backward elimination rather than by AIC or AICc, (ii) using a stringent cut-off, e.g., p = 0.0001, and (iii) conducting a sensitivity analysis of the results. With thoughtful application, fixed-effect codon models should provide a useful tool for large-scale multi-gene analyses.
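The three selection strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact elimination procedure, so a generic likelihood-ratio-test version is assumed here. The component names ("omega" for selection pressure, "kappa" for the transition-to-transversion ratio), the `fit` callback, the sample size used for AICc, and all log-likelihood values in `toy_fits` are hypothetical and chosen only so the example runs.

```python
import math


def aic(lnL, k):
    """Akaike information criterion: -2 lnL + 2k (lower is better)."""
    return -2.0 * lnL + 2.0 * k


def aicc(lnL, k, n):
    """Small-sample corrected AIC. n is the sample size; taking it to be
    the number of codon sites is an assumption of this sketch."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(lnL, k) + 2.0 * k * (k + 1) / (n - k - 1)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = math.exp(-half)          # i = 0 term of the finite series
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))  # exact tail for df = 1
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


def lrt_pvalue(lnL_full, lnL_reduced, df):
    """Likelihood ratio test of a reduced model nested in a fuller one."""
    stat = max(2.0 * (lnL_full - lnL_reduced), 0.0)
    return chi2_sf(stat, df)


def backward_eliminate(components, fit, alpha=1e-4):
    """Start with every component heterogeneous among partitions and drop
    the least significant one until all remaining ones have p <= alpha.
    fit(frozenset) must return (lnL, number_of_parameters)."""
    current = frozenset(components)
    lnL_cur, k_cur = fit(current)
    while current:
        trials = []
        for c in current:
            reduced = current - {c}
            lnL_red, k_red = fit(reduced)
            p = lrt_pvalue(lnL_cur, lnL_red, k_cur - k_red)
            trials.append((p, sorted(reduced), lnL_red, k_red))
        p, reduced, lnL_red, k_red = max(trials)
        if p <= alpha:
            break  # every remaining component is significant
        current, lnL_cur, k_cur = frozenset(reduced), lnL_red, k_red
    return current


# Hypothetical fitted models: (log-likelihood, parameter count). Made-up values.
toy_fits = {
    frozenset({"omega", "kappa"}): (-1000.0, 12),
    frozenset({"omega"}): (-1000.4, 11),
    frozenset({"kappa"}): (-1015.0, 11),
    frozenset(): (-1016.0, 10),
}

selected = backward_eliminate({"omega", "kappa"}, toy_fits.__getitem__)
print(sorted(selected))
for parts, (lnL, k) in toy_fits.items():
    print(sorted(parts), round(aic(lnL, k), 2), round(aicc(lnL, k, n=200), 2))
```

In practice the `fit` callback would wrap a call to whatever codon-model software produces the maximized log-likelihoods. The default `alpha` mirrors the stringent cut-off of p = 0.0001 recommended in the abstract.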

Bibliographic record

Source: BMC Evolutionary Biology, 2007, Issue 1
Authors: Le Bao; Hong Gu; Katherine A Dunn; Joseph P Bielawski
Original format: PDF
Chinese Library Classification: General biology
