PLoS Computational Biology

Predicting Protein Function with Hierarchical Phylogenetic Profiles: The Gene3D Phylo-Tuner Method Applied to Eukaryotic Genomes

Abstract

“Phylogenetic profiling” is based on the hypothesis that during evolution functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence–absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step and largely explains why previous work in this area has mainly focused on using presence–absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence–absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity—from 30% to 100%—and phylogenetic profiles built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will “auto-tune” with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by the conventional presence–absence profile comparisons, and it does not require a priori any fixed criteria to define orthologous genes.
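
The profile comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the Euclidean distance is normalised, so the sketch assumes each copy-number profile is scaled to unit length before comparison. The function names, the two identity levels, and the toy copy numbers for four genomes are all hypothetical.

```python
import math
from itertools import product


def normalise(profile):
    """Scale a copy-number profile to unit Euclidean length.

    Assumption: the abstract does not specify the normalisation; unit-length
    scaling is used here purely for illustration.
    """
    norm = math.sqrt(sum(x * x for x in profile))
    if norm == 0:
        return [0.0] * len(profile)
    return [x / norm for x in profile]


def profile_distance(a, b):
    """Euclidean distance between two normalised copy-number profiles."""
    na, nb = normalise(a), normalise(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)))


def best_level_pair(family_a, family_b):
    """Compare profiles at every pair of identity levels; keep the closest.

    Each family is a dict mapping a sequence-identity level (percent) to a
    profile: a list of domain copy numbers, one entry per genome.
    Returns (distance, level_a, level_b).
    """
    best = None
    for (la, pa), (lb, pb) in product(family_a.items(), family_b.items()):
        d = profile_distance(pa, pb)
        if best is None or d < best[0]:
            best = (d, la, lb)
    return best


# Hypothetical toy data: copy numbers in four genomes at two identity levels.
family_a = {30: [10, 12, 8, 20], 60: [2, 4, 1, 8]}
family_b = {30: [5, 5, 5, 5], 60: [4, 8, 2, 16]}

result = best_level_pair(family_a, family_b)
print(result)
```

In this toy data the two 60% profiles are proportional, so the closest pair is found with both families at the 60% level. Under unit-length scaling, profiles that differ only by a constant factor have a distance of zero, which is one way of capturing "correlated changes in domain copy number"; a different normalisation would rank pairs differently.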