BMC Genomics

Enrichment of Triticum aestivum gene annotations using ortholog cliques and gene ontologies in other plants



Abstract

While the gargantuan multinational effort to sequence T. aestivum nears completion, the annotation of the vast number of wheat genes and proteins is in its infancy. Previous experimental studies on model plant organisms such as A. thaliana and O. sativa provide a plethora of gene annotations that can serve as starting points for wheat gene annotation, provided that solid cross-species gene-to-gene and protein-to-protein correspondences are available. DNA and protein sequences and corresponding annotations for T. aestivum and 9 other plant species were collected from Ensembl Plants release 22 and curated. Cliques of predicted 1-to-1 orthologs were identified, and an annotation enrichment model was defined based on existing gene-GO term associations and phylogenetic relationships among wheat and the 9 other plant species. A total of 13 cliques of size 10 were identified, which represent putative functionally equivalent genes and proteins in the 10 plant species. Eighty-five new and more specific GO terms were associated with wheat genes in the 13 cliques of size 10, which represents a 65% increase over the 130 previously known GO terms. Similar expression patterns for 4 genes from Arabidopsis, barley, maize and rice in cliques of size 10 provide experimental evidence to support our model. Overall, based on cliques of size 3 or larger, our model enriched the existing gene-GO term associations for 7,838 (8%) wheat genes, of which 2,139 had no previous annotation. Our novel comparative genomics approach enriches existing T. aestivum gene annotations based on cliques of predicted 1-to-1 orthologs, phylogenetic relationships and existing gene ontologies from 9 other plant species.
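The clique-based idea in the abstract can be illustrated with a small sketch. This is not the authors' model: the abstract does not spell out how phylogenetic relationships or GO term specificity are weighted, so the code below simply proposes to a wheat gene every GO term carried by the other members of its clique. The function names, the gene identifiers (`Ta1`, `At1`, ...) and the `GO:x1`-style term labels are hypothetical placeholders, and the use of maximal cliques with at most one gene per species is an assumption about how "cliques of predicted 1-to-1 orthologs" might be computed.

```python
def build_adjacency(ortholog_pairs):
    """Turn a list of predicted 1-to-1 ortholog pairs into an undirected graph."""
    adj = {}
    for a, b in ortholog_pairs:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def enrich_wheat(ortholog_pairs, species_of, go_terms, min_size=3, target="wheat"):
    """Propose new GO terms for target-species genes from their clique partners.

    Assumption: plain set union of partner annotations; the published model
    additionally uses phylogenetic relationships, which are ignored here.
    """
    proposals = {}
    for clique in maximal_cliques(build_adjacency(ortholog_pairs)):
        species = [species_of[g] for g in clique]
        # Keep cliques that are large enough and hold one gene per species.
        if len(clique) < min_size or len(set(species)) != len(species):
            continue
        targets = [g for g in clique if species_of[g] == target]
        if not targets:
            continue
        gene = targets[0]
        donated = set().union(*(go_terms.get(g, set()) for g in clique if g != gene))
        new_terms = donated - go_terms.get(gene, set())
        if new_terms:
            proposals.setdefault(gene, set()).update(new_terms)
    return proposals


# Toy data (entirely made up for illustration).
pairs = [("Ta1", "At1"), ("Ta1", "Os1"), ("At1", "Os1"), ("Ta2", "At2")]
species_of = {"Ta1": "wheat", "Ta2": "wheat", "At1": "arabidopsis",
              "At2": "arabidopsis", "Os1": "rice"}
go_terms = {"Ta1": {"GO:x2"}, "At1": {"GO:x1"}, "Os1": {"GO:x1", "GO:x2"},
            "At2": {"GO:x3"}}

proposals = enrich_wheat(pairs, species_of, go_terms)
print(proposals)
```

In this toy input, `Ta1`, `At1` and `Os1` form a clique of size 3, so `Ta1` is offered the one term it lacks, while the `Ta2`/`At2` pair falls below the size threshold of 3 mentioned in the abstract and is skipped.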
