Molecular Biology and Evolution

Detecting and Locating Whole Genome Duplications on a Phylogeny: A Probabilistic Approach

Abstract

Whole genome duplications (WGDs) followed by massive gene loss have occurred in the evolutionary history of many groups. WGDs are usually inferred from the age distribution of paralogs (Ks-based methods) or from gene collinearity data (synteny). However, Ks-based methods are limited to detecting recent WGDs because of saturation effects and the difficulty of dating old duplicates, and synteny is difficult to reconstruct for distantly related species. Recently, Jiao et al. (Jiao Y, Wickett N, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, Hu Y, Liang H, Soltis PS, et al. 2011. Ancestral polyploidy in seed plants and angiosperms. Nature 473:97–100) introduced an empirical method that aims to detect a peak in duplication ages among nodes selected from a previous phylogenetic analysis. In this context, we present two rigorous methods based on data from multiple gene families and on a new probabilistic model. Our model assumes that all gene lineages are instantaneously duplicated at the WGD event, with a possible almost-immediate loss of some of the extra copies. Our reconciliation method relies on aligned molecular sequences, whereas our gene count method relies only on gene count data across species. We show, using extensive simulations, that both methods have good detection power. Surprisingly, the gene count method shows no loss of power compared with the reconciliation method, even though it does not use sequence information. Finally, we illustrate the performance of our methods on a benchmark yeast data set. Both methods detect the well-known WGD in the Saccharomyces cerevisiae clade and agree on a small retention rate at the WGD, consistent with the rate established by synteny-based methods.
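As a rough illustration of the WGD step described in the abstract, the Python sketch below computes the distribution of gene copy numbers immediately after a WGD. It assumes that each of the n lineages present is duplicated and that each extra copy is retained independently with probability q. The abstract does not spell out this binomial form, so the form itself, the symbol q, and the function names are assumptions made for illustration and are not the authors' implementation.

```python
from math import comb


def wgd_count_pmf(n: int, q: float) -> dict[int, float]:
    """Distribution of the gene count right after a WGD (illustrative sketch).

    Assumption (not from the paper's text): each of the n lineages gains one
    extra copy, and each extra copy survives independently with probability q.
    The post-WGD count is then n + Binomial(n, q).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    return {
        n + k: comb(n, k) * q**k * (1.0 - q) ** (n - k)
        for k in range(n + 1)
    }


def expected_count(n: int, q: float) -> float:
    """Mean post-WGD count under the same assumption: n * (1 + q)."""
    return n * (1.0 + q)
```

Under this assumption, a small q corresponds to the "small retention rate" mentioned in the abstract: most gene families return to their pre-WGD count almost immediately, and only a few keep a duplicate.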
