Journal: Molecular Biology and Evolution

Integrating Markov Clustering and Molecular Phylogenetics to Reconstruct the Cyanobacterial Species Tree from Conserved Protein Families


Abstract

Attempts to classify living organisms by their physical characteristics are as old as biology itself. The advent of protein and DNA sequencing—most notably the use of 16S ribosomal RNA—defined a new level of classification that now forms our basic understanding of the history of life on earth. High-throughput sequencing currently provides DNA sequences at an unprecedented rate, not only providing a wealth of information but also posing considerable analytical challenges. Here we present comparative genomics–based methods useful for automating evolutionary analysis between any number of species. As a practical example, we applied our method to the well-studied cyanobacterial lineage. The 24 cyanobacterial genomes compared here occupy a wide variety of environmental niches and play major roles in global carbon and nitrogen cycles. By integrating phylogenetic data inferred for upward of 1,000 protein-coding genes common to all or most cyanobacteria, we have reconstructed an evolutionary history of the phylum, establishing a framework for resolving key issues regarding the evolution of their metabolic and phenotypic diversity. Greater resolution on individual branches can be attained by telescoping inward to the larger set of conserved proteins between fewer taxa. The construction of all individual protein phylogenies allows for quantitative tree scoring, providing insight into the evolutionary history of each protein family as well as probing the limits of phylogenetic resolution. The tools incorporated here are fast, computationally tractable, and easily extendable to other phyla and provide a scaleable framework for contrasting and integrating the information present in thousands of protein-coding genes within related genomes.
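The title names Markov clustering as the step that groups proteins into families, but the abstract does not give the authors' pipeline or parameters. The following is only a minimal sketch of the general Markov clustering idea (alternating expansion and inflation of a column-stochastic matrix), assuming a small, made-up pairwise similarity matrix. The function names, the inflation value of 2.0, the self-loop weight, and the toy scores are all illustrative assumptions, not values from the paper.

```python
def normalize_columns(matrix):
    """Scale each column so it sums to 1 (column-stochastic)."""
    n = len(matrix)
    result = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        for i in range(n):
            result[i][j] = matrix[i][j] / total if total > 0 else 0.0
    return result


def multiply(a, b):
    """Plain square-matrix product, kept dependency-free for the sketch."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(similarity, inflation=2.0, max_iterations=100, tolerance=1e-9):
    """Cluster nodes of a weighted similarity graph with a basic MCL loop."""
    n = len(similarity)
    # Assumption: add a self-loop of weight 1.0 to every node, a common
    # convention that helps the iteration settle.
    flow = normalize_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iterations):
        # Expansion: flow spreads along two-step random walks.
        expanded = multiply(flow, flow)
        # Inflation: raise entries to a power and renormalize, which
        # strengthens strong flows and weakens weak ones.
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded]
        )
        change = max(abs(inflated[i][j] - flow[i][j])
                     for i in range(n) for j in range(n))
        flow = inflated
        if change < tolerance:
            break
    # Rows that keep non-negligible flow act as attractors; the columns
    # they attract form one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if flow[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)


# Hypothetical scores for six proteins: two tight groups joined by one weak hit.
similarity = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
families = markov_cluster(similarity)
print(families)
```

On this toy graph the two densely connected groups separate into two families, because inflation suppresses the weak link between proteins 2 and 3. In a workflow like the one the abstract describes, each resulting family would then be aligned and given its own phylogeny; that part is not sketched here.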
