Journal: Bioinformatics

Using shared genomic synteny and shared protein functions to enhance the identification of orthologous gene pairs


Abstract

MOTIVATION: Orthologous gene pairs are generally identified by sequence similarity: gene pairs that are mutual 'best hits' between the genomes being compared are asserted to be orthologs. Although this method identifies most orthologous gene pairs with high confidence, it misses a fraction of them, especially genes in duplicated gene families. The approach also depends heavily on the completeness and quality of gene annotation; when gene sequences are not correctly represented, it is unlikely to find the correct ortholog. To overcome these limitations, we have developed an approach that identifies orthologous gene pairs using shared chromosomal synteny and the annotation of protein function.

RESULTS: Assembled mouse and human genomes were used to identify regions of conserved synteny between them. 'Syntenic anchors' are conserved non-repetitive locations between the mouse and human genomes. Using these anchors, we identified blocks of sequence that contain consistently ordered anchors between the two genomes (syntenic blocks). We then used this synteny information to help identify orthologous gene pairs between the mouse and human genomes. The approach combines mutual selection of the best tBlastX hits between human and mouse transcripts with inference of orthologous relationships from three kinds of evidence: shared syntenic anchors, colocation in the same syntenic block, and the same annotated protein function. Using this approach, we found 19,357 orthologous gene pairs between the human and mouse genomes, a 20% increase over the number of orthologs identified by conventional approaches.
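The two building blocks named in the abstract, mutual best hits and consistently ordered anchors, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the tuple layouts, and the use of a single numeric score per hit are all assumptions made for illustration, and the block-building rule (same chromosome pair, mouse position moving in one direction) is a deliberate simplification of whatever criteria the paper actually applies.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples, for example
    parsed from tabular tBlastX output (parsing is not shown; the tuple
    layout is an assumption of this sketch).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(human_vs_mouse, mouse_vs_human):
    """Return (human, mouse) pairs that are each other's best hit."""
    forward = best_hits(human_vs_mouse)
    reverse = best_hits(mouse_vs_human)
    return sorted((h, m) for h, m in forward.items() if reverse.get(m) == h)


def syntenic_blocks(anchors):
    """Group anchors into runs that keep a consistent order in both genomes.

    `anchors` is a list of (human_chrom, human_pos, mouse_chrom, mouse_pos)
    tuples (hypothetical layout). Anchors are walked in human order; a block
    is extended while the chromosome pair is unchanged and the mouse
    position keeps moving in the same direction (forward or inverted).
    """
    blocks = []
    current = []
    direction = 0  # 0 = not yet known, +1 = forward, -1 = inverted
    for anchor in sorted(anchors, key=lambda a: (a[0], a[1])):
        if current:
            prev = current[-1]
            same_chroms = anchor[0] == prev[0] and anchor[2] == prev[2]
            step = (anchor[3] > prev[3]) - (anchor[3] < prev[3])
            if same_chroms and step != 0 and direction in (0, step):
                direction = step
                current.append(anchor)
                continue
            blocks.append(current)
        current = [anchor]
        direction = 0
    if current:
        blocks.append(current)
    return blocks
```

In a pipeline of the kind the abstract describes, gene pairs that fail the reciprocal test could then be reconsidered if they share an anchor, fall in the same block, and carry the same functional annotation; that combination step is left out here because the abstract does not specify how the evidence is weighed.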
