SDM: A Fast Distance-Based Approach for (Super)Tree Building in Phylogenomics

Alexis Criscuolo, Vincent Berry, Emmanuel J. P. Douzery, and Olivier Gascuel


Abstract

Phylogenomic studies aim to build phylogenies from large sets of homologous genes. Such “genome-sized” data require fast methods because of the typically large numbers of taxa examined. In this framework, distance-based methods are useful for exploratory studies and for building a starting tree to be refined by a more powerful maximum likelihood (ML) approach. However, estimating evolutionary distances directly from concatenated genes gives poor topological signal, as genes evolve at different rates. We propose a novel method, named super distance matrix (SDM), which follows the same line as the average consensus supertree (ACS; Lapointe and Cucumel, 1997) and combines the evolutionary distances obtained from each gene into a single distance supermatrix to be analyzed using a standard distance-based algorithm. SDM deforms the source matrices, without modifying their topological message, to bring them as close as possible to each other; these deformed matrices are then averaged to obtain the distance supermatrix. We show that this problem is equivalent to the minimization of a least-squares criterion subject to linear constraints. This problem has a unique solution, which is obtained by solving a linear system. As this system is sparse, solving it in practice requires O(n^a k^a) time, where n is the number of taxa, k the number of matrices, and a < 2, which allows the distance supermatrix to be quickly obtained. Several uses of SDM are proposed, from fast exploratory studies to more accurate approaches requiring heavier computing time. Using simulations, we show that SDM is a relevant alternative to the standard matrix representation with parsimony (MRP) method, notably when the taxa sets of the different genes have low overlap. We also show that SDM can be used to build an excellent starting tree for an ML approach, which both reduces the computing time and increases the topological accuracy. We use SDM to analyze the data set of Gatesy et al. (2002, Syst. Biol. 51: 652–664) that involves 48 genes of 75 placental mammals. The results indicate that these genes have strong rate heterogeneity and confirm the simulation conclusions.
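
The least-squares idea in the abstract can be illustrated with a deliberately reduced sketch. The abstract does not say how SDM deforms each source matrix, so the code below assumes the simplest deformation one could choose: a single scale factor per matrix, with the factors constrained to sum to k. It minimizes the squared deviation of each scaled distance from the per-pair average by solving the resulting constrained (KKT) linear system, then averages the scaled matrices. The function names (solve_linear, scale_factors, supermatrix), the dictionary representation of a distance matrix, and the dense Gaussian elimination are all assumptions made for illustration. This is not the authors' implementation, and it does not exploit the sparsity the abstract mentions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: scale factors not uniquely determined")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def scale_factors(matrices):
    """Scale factors minimizing the spread of scaled distances around their
    per-pair mean, subject to the linear constraint sum(alpha) == k.

    Each matrix is a dict {(taxon_i, taxon_j): distance}; pair keys are
    assumed to be ordered consistently across matrices.
    """
    k = len(matrices)
    q = [[0.0] * k for _ in range(k)]  # quadratic form: criterion = alpha^T q alpha
    for pair in set().union(*matrices):
        holders = [p for p in range(k) if pair in matrices[p]]
        m = len(holders)
        for p in holders:
            dp = matrices[p][pair]
            q[p][p] += dp * dp
            for r in holders:
                q[p][r] -= dp * matrices[r][pair] / m
    # KKT system: 2 q alpha + lambda * 1 = 0 and 1^T alpha = k
    kkt = [[2.0 * q[p][r] for r in range(k)] + [1.0] for p in range(k)]
    kkt.append([1.0] * k + [0.0])
    rhs = [0.0] * k + [float(k)]
    return solve_linear(kkt, rhs)[:k]


def supermatrix(matrices):
    """Scale every source matrix, then average the scaled distances per pair."""
    alphas = scale_factors(matrices)
    sums, counts = {}, {}
    for alpha, mat in zip(alphas, matrices):
        for pair, d in mat.items():
            sums[pair] = sums.get(pair, 0.0) + alpha * d
            counts[pair] = counts.get(pair, 0) + 1
    return alphas, {pair: sums[pair] / counts[pair] for pair in sums}


# Toy data (hypothetical): the second "gene" evolves twice as fast as the first.
gene1 = {("A", "B"): 1.0, ("A", "C"): 2.0, ("B", "C"): 3.0}
gene2 = {pair: 2.0 * d for pair, d in gene1.items()}
alphas, sdm = supermatrix([gene1, gene2])
print(alphas)
print(sdm)
```

In this toy case the two matrices differ only by rate, so the factors come out at approximately 4/3 and 2/3 and the scaled matrices coincide. Real source matrices would only be brought closer together, not made identical, and the sketch assumes the matrices share enough taxon pairs for the system to be well determined.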


Bibliographic record

Source
Systematic Biology, 2006, issue 5, pp. 740-755 (16 pages)
Authors
Alexis Criscuolo, Vincent Berry, Emmanuel J. P. Douzery, and Olivier Gascuel
Affiliations

Groupe Phylogénie Moléculaire, ISEM, Université Montpellier 2, CC 064, 34095 Montpellier Cedex 05, France

Equipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM (CNRS, Université Montpellier 2), 161 rue Ada, 34392 Montpellier Cedex 05, France. E-mail: gascuel{at}lirmm.fr (O.G.)

Original format: PDF
Text language: English
