
A Phylogeny-Based Benchmarking Test for Orthology Inference Reveals the Limitations of Function-Based Validation


Abstract

Accurate orthology prediction is crucial for many applications in the post-genomic era. The lack of broadly accepted benchmark tests precludes a comprehensive analysis of orthology inference. So far, functional annotation between orthologs serves as a performance proxy. However, this violates the fundamental principle of orthology as an evolutionary definition, while it is often not applicable due to limited experimental evidence for most species. Therefore, we constructed high quality "gold standard" orthologous groups that can serve as a benchmark set for orthology inference in bacterial species. Herein, we used this dataset to demonstrate 1) why a manually curated, phylogeny-based dataset is more appropriate for benchmarking orthology than other popular practices and 2) how it guides database design and parameterization through careful error quantification. More specifically, we illustrate how function-based tests often fail to identify false assignments, misjudging the true performance of orthology inference methods. We also examined how our dataset can instruct the selection of a “core” species repertoire to improve detection accuracy. We conclude that including more genomes at the proper evolutionary distances can influence the overall quality of orthology detection. The curated gene families, called Reference Orthologous Groups, are publicly available at .
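
As a rough illustration of what error quantification against a phylogeny-based reference set might look like, the sketch below compares predicted orthologous groups with reference groups at the level of gene pairs. It is not the authors' procedure: the pairwise scoring scheme, the function names (`ortholog_pairs`, `pairwise_errors`) and the gene identifiers are assumptions made for this example only.

```python
from itertools import combinations


def ortholog_pairs(groups, allowed=None):
    """Return the set of unordered gene pairs that share a group.

    If `allowed` is given, genes outside that set are ignored.
    """
    pairs = set()
    for members in groups:
        genes = set(members)
        if allowed is not None:
            genes &= allowed
        # Sorting makes each pair canonical, so (a, b) and (b, a) coincide.
        for a, b in combinations(sorted(genes), 2):
            pairs.add((a, b))
    return pairs


def pairwise_errors(predicted_groups, reference_groups):
    """Score predicted groups against reference groups, pair by pair.

    Only genes covered by the reference set are scored, because the
    reference says nothing about genes it does not contain.
    """
    reference_genes = {g for group in reference_groups for g in group}
    reference = ortholog_pairs(reference_groups)
    predicted = ortholog_pairs(predicted_groups, allowed=reference_genes)

    true_pos = predicted & reference
    false_pos = predicted - reference   # pairs wrongly placed together
    false_neg = reference - predicted   # true ortholog pairs that were split

    precision = len(true_pos) / len(predicted) if predicted else 0.0
    recall = len(true_pos) / len(reference) if reference else 0.0
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical gene identifiers: two reference groups, three predicted groups.
reference_groups = [["a1", "b1", "c1"], ["a2", "b2"]]
predicted_groups = [["a1", "b1", "a2"], ["c1"], ["b2"]]
result = pairwise_errors(predicted_groups, reference_groups)
```

Counting false positives and false negatives separately shows whether a method tends to merge distinct families or to split true ones, which an aggregate score alone would hide.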
