Frontiers in Plant Science

UniProtKB amid the turmoil of plant proteomics research

Abstract

The UniProt KnowledgeBase (UniProtKB) provides a single, centralized, authoritative resource for protein sequences and functional information. The majority of its records are based on automatic translation of coding sequences (CDS) provided by submitters at the time of initial deposition to the nucleotide sequence databases (INSDC). This article gives a general overview of the current situation, with some specific illustrations drawn from our annotation of the Arabidopsis and rice proteomes. More and more frequently, only the raw sequence of a complete genome is deposited in the nucleotide sequence databases, while the gene model predictions and annotations are kept in separate, specialized model organism databases (MODs). To provide the complete proteomes of model organisms, UniProtKB had to implement pipelines to import protein sequences from Ensembl and EnsemblGenomes. A single genome can be the target of several unrelated sequencing projects, and the final assemblies and gene model predictions may diverge quite significantly. In addition, several cultivars of the same species are often sequenced (the sequencing of 1001 Arabidopsis cultivars is currently under way), and the resulting proteomes are far from identical. One challenge for UniProtKB is therefore to store and organize these data in a convenient way and to clearly define the reference proteomes that should be made available to users. Manual annotation is one of the hallmarks of the Swiss-Prot section of UniProtKB. Besides adding functional annotation, curators check, and often correct, gene model predictions. For plants, this task is limited to Arabidopsis thaliana and Oryza sativa subsp. japonica. Proteomics data that provide experimental evidence confirming the existence of proteins, or that identify sequence features such as post-translational modifications, are also imported into UniProtKB records, and the knowledgebase is cross-referenced to numerous proteomics resources.
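
To make the phrase "automatic translation of coding sequences" concrete, here is a minimal Python sketch of translating a CDS into a protein sequence. It is an illustration only, not the UniProtKB pipeline: the function name `translate_cds` is hypothetical, the standard genetic code (NCBI translation table 1) is assumed, and real submissions also require handling of alternative genetic codes, partial CDS, and ambiguous bases.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons
# enumerated in T, C, A, G order for each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence into a protein sequence.

    Hypothetical helper for illustration. Translation stops at the
    first stop codon; codons containing ambiguous bases become 'X'.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA input as well
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate_cds("ATGGCCTGA")` returns `"MA"`: ATG encodes methionine, GCC encodes alanine, and TGA is a stop codon. Because the protein is derived mechanically from the submitted CDS, an incorrect gene model yields an incorrect protein sequence, which is why curators check the gene model predictions.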
