Journal: Proteomics

Analysis of the tryptic search space in UniProt databases



Abstract

In this article, we provide a comprehensive study of the content of the Universal Protein Resource (UniProt) protein data sets for human and mouse. The tryptic search spaces of the UniProtKB (UniProt Knowledgebase) complete proteome sets were compared with other data sets from UniProtKB and with the corresponding organism-specific data sets from the International Protein Index (IPI), Reference Sequence (RefSeq), Ensembl, and UniRef100 (UniProt Reference Clusters). All protein forms annotated in UniProtKB, both canonical sequences and isoforms, were evaluated in this study. Natural and disease-associated amino acid variants annotated in UniProtKB were also included in the evaluation, and peptide unicity was evaluated for each data set. Furthermore, the peptide information in the UniProtKB data sets was compared against the peptide-level identifications available in the main MS-based proteomics repositories. The peptides observed in these repositories are an important source of information for protein databases, as they provide supporting evidence for the existence of otherwise predicted proteins. Likewise, the repositories could use the information available in UniProtKB to direct reprocessing efforts toward specific sets of peptides or proteins of interest. In summary, we provide comprehensive information about the different organism-specific sequence data sets available from UniProt, together with the pros and cons of each in terms of search space for MS-based bottom-up proteomics workflows. The aim of the analysis is to give a clear view of the tryptic search space of UniProt and other protein databases, so that scientists can select the data sets most appropriate for their purposes.
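To make the terms "tryptic search space" and "peptide unicity" concrete, the following is a minimal Python sketch and not the authors' pipeline. It assumes the common simplified trypsin rule (cleave C-terminal to K or R unless the next residue is P) and reads "unicity" as a peptide mapping to exactly one protein entry. The length limits, the function names, and the toy accessions `P1` and `P2` are hypothetical; the abstract does not state the parameters actually used.

```python
from collections import defaultdict


def cleavage_fragments(sequence):
    """Split a protein sequence at tryptic sites.

    Assumed rule: cleave after K or R, except when followed by P.
    """
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def tryptic_digest(sequence, missed_cleavages=0, min_len=6, max_len=30):
    """Return the set of peptides within the (hypothetical) length window."""
    fragments = cleavage_fragments(sequence)
    peptides = set()
    for i in range(len(fragments)):
        last = min(i + missed_cleavages + 1, len(fragments))
        for j in range(i, last):
            peptide = "".join(fragments[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return peptides


def peptide_owners(proteins, **digest_kwargs):
    """Map each peptide to the set of protein accessions that contain it."""
    owners = defaultdict(set)
    for accession, sequence in proteins.items():
        for peptide in tryptic_digest(sequence, **digest_kwargs):
            owners[peptide].add(accession)
    return owners


# Toy data set: two made-up entries sharing one peptide.
proteins = {"P1": "MKAAAAAAKCCCCCCR", "P2": "GGGGGGKCCCCCCR"}
owners = peptide_owners(proteins)
unique = {pep for pep, accs in owners.items() if len(accs) == 1}
print(f"search space: {len(owners)} peptides, {len(unique)} unique")
```

Under these assumptions, the search space of a data set is the set of keys in `owners`. Comparing two data sets then comes down to set operations on their peptide sets, and adding isoforms or variant sequences to `proteins` shows how the space grows while the share of unique peptides changes.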
