Conference: IEEE International Symposium on Bioinformatics and Bioengineering

Phylogeny By Top Down Clustering Using a Given Multiple Alignment


Abstract

We present a new phylogenetic tree construction algorithm that takes as input an alignment of multiple sequences. We assume that the sequences in the given alignment are closely related by evolution and that the alignment is biologically correct. Our approach resembles the well-known clustering procedure UPGMA in that we define the distance between two clusters similarly, and we can build the tree using vertical distances to denote evolutionary distances. However, it differs fundamentally in how clusters are chosen, since it works from top to bottom: we first select a pair of clusters with the maximum distance, and then recursively create clusters within each. Since trying all possible clusterings is impractical, we approximate this approach by examining all conserved segments in the given multiple sequence alignment; for each segment we try two clusters: one consisting of the sequences in which the conserved region is identical, and one composed of the rest of the sequences. Our main hypothesis is that the top-down approach will separate large cluster pairs from each other better than the bottom-up UPGMA method, without preventing very close sequences from appearing in nearby leaves. We tested our algorithm using a multiple sequence alignment given in a recent paper that builds, with existing techniques, a phylogenetic tree of amino acid sequences from the Runt domain of RUNX genes from a set of species. Our tree has a few differences compared to this tree. We believe that our approach provides valuable alternative information for phylogeny.
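The top-down procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the pairwise distance is taken to be the fraction of mismatched columns (p-distance), a "conserved segment" is approximated by a fixed-width window of columns (the hypothetical `window` parameter), and rows that no window can separate are returned together as an unresolved group. The function names are likewise hypothetical.

```python
from itertools import product


def p_distance(a, b):
    """Fraction of aligned columns at which two equal-length rows differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_distance(aln, left, right):
    """UPGMA-style distance: mean pairwise distance between members of two clusters."""
    total = sum(p_distance(aln[i], aln[j]) for i, j in product(left, right))
    return total / (len(left) * len(right))


def candidate_splits(aln, members, window):
    """Yield (group, rest) pairs where every row in group is identical within one window."""
    length = len(aln[members[0]])
    seen = set()
    for start in range(length - window + 1):
        by_segment = {}
        for name in members:
            by_segment.setdefault(aln[name][start:start + window], []).append(name)
        for group in by_segment.values():
            key = frozenset(group)
            # Skip groups that contain every member (no split) or were already tried.
            if len(group) < len(members) and key not in seen:
                seen.add(key)
                yield group, [m for m in members if m not in key]


def top_down_tree(aln, members=None, window=3):
    """Recursively split on the candidate pair of clusters with the maximum distance."""
    members = sorted(aln) if members is None else members
    if len(members) == 1:
        return members[0]
    splits = list(candidate_splits(aln, members, window))
    if not splits:
        # Assumption: rows that no window separates stay together, unresolved.
        return tuple(members)
    left, right = max(splits, key=lambda s: cluster_distance(aln, s[0], s[1]))
    return (top_down_tree(aln, left, window), top_down_tree(aln, right, window))


# Toy alignment (made-up rows, not data from the paper).
toy = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTGTCCGT", "d": "TTGTCCGA"}
print(top_down_tree(toy))  # → (('a', 'b'), ('c', 'd'))
```

In this toy input the first split separates {a, b} from {c, d} because that pair of clusters has the largest average distance among the candidates, and the near-identical rows still end up as sibling leaves, which is the behaviour the abstract's hypothesis describes.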
