
PRODOC: a resource for the comparison of tethered protein domain architectures with in-built information on remotely related domain families

Abstract

PROtein Domain Organization and Comparison (PRODOC) comprises several programs that enable convenient comparison of proteins represented as sequences of domains. The in-built dataset currently consists of ∼698 000 proteins from 192 organisms with complete genomic data, and all the SWISSPROT proteins obtained from the Pfam database. Every entry in PRODOC is represented as a sequence of functional domains, assigned using hidden Markov models, rather than as a sequence of amino acids. On average, 69% of the proteins in the proteomes and 49% of the residues are covered by functional domain assignments. Software tools allow the user to query the dataset with a sequence of domains and identify proteins with the same, a jumbled, or a circularly permuted arrangement of domains. Because proteins with the same or jumbled domain sequences are proposed to have similar functions, this search tool is useful in assigning the overall function of a multi-domain protein. Unique features of PRODOC include the generation of alignments between multi-domain proteins on the basis of their domain sequences, and in-built information on distantly related domain families that form superfamilies. PRODOC can also be used to identify domain sharing and gene fusion events across organisms. An exhaustive genome–genome comparison tool in PRODOC further enables the detection of successive domain sharing and domain fusion events across two organisms. This tool permits the identification of gene clusters involved in similar biological processes in two closely related organisms. The URL for PRODOC is .
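
To make the domain-sequence search described in the abstract more concrete, the following is a minimal sketch of how a query architecture could be classified against stored architectures as the same, circularly permuted, or jumbled. It is an illustration only and not PRODOC's actual implementation: the function names, the toy dataset, and the protein identifiers are hypothetical, and each protein is assumed to be reduced to an ordered list of domain family names.

```python
from collections import Counter


def is_circular_permutation(a, b):
    """Return True if domain list b is a rotation of domain list a."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    if not a:
        return True
    doubled = a + a  # every rotation of a is a window of length len(a) in a + a
    n = len(a)
    return any(doubled[i:i + n] == b for i in range(n))


def classify_architecture(query, target):
    """Classify how the target domain order relates to the query domain order."""
    query, target = list(query), list(target)
    if query == target:
        return "same"
    if is_circular_permutation(query, target):
        return "circularly permuted"
    if Counter(query) == Counter(target):
        # Same domains with the same copy numbers, but in another order.
        return "jumbled"
    return "different"


def search(query, dataset):
    """Return {protein_id: relation} for every protein related to the query."""
    hits = {}
    for protein_id, domains in dataset.items():
        relation = classify_architecture(query, domains)
        if relation != "different":
            hits[protein_id] = relation
    return hits


# Hypothetical toy dataset: protein identifier -> ordered domain assignments.
dataset = {
    "protA": ["SH3", "SH2", "Pkinase"],
    "protB": ["SH2", "Pkinase", "SH3"],
    "protC": ["SH2", "SH3", "Pkinase"],
    "protD": ["PDZ", "SH3"],
}

hits = search(["SH3", "SH2", "Pkinase"], dataset)
for protein_id, relation in hits.items():
    print(protein_id, relation)
```

In this sketch a circular permutation is tested before the more general jumbled case, since every rotation is also a reordering of the same domains. Treating remotely related families as equivalent, as the abstract describes for superfamilies, would require an additional mapping from family to superfamily that is not modelled here.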
