BMC Structural Biology

Identification of similar regions of protein structures using integrated sequence and structure analysis tools


Abstract

Background Understanding protein function from structure is a challenging problem. Sequence-based approaches for finding homology are widely used to annotate both structure and function. 3D structural information on protein domains and their interactions provides a view of structure-function relationships that complements sequence information. We have developed a web site (http://www.sblest.org/) and an API of web services that enable users to submit protein structures and identify statistically significant neighbors, along with the underlying structural environments that produce each match, using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer-based superfamily predictions to give an integrated view of the prediction of SCOP superfamilies, EC numbers and GO terms, and to identify the protein structural environments associated with each prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest.
Results Users can submit their own queries or use a structure already in the PDB. The databases that a user can currently query include the structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70, CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST, HMMer and S-BLEST. We have evaluated how well annotation transfer performs for SCOP IDs, Gene Ontology (GO) IDs and EC numbers. The method is efficient and fully automated, generally taking around fifteen minutes for a 400-residue protein.
Conclusion With structural genomics initiatives determining structures that have little, if any, functional characterization, the development of protein structure and function analysis tools is a necessary endeavor. We have developed an application that works towards a solution to this problem using common structure-based and sequence-based analysis tools. These approaches are able to find statistically significant environments in a database of protein structures, and the method is able to quantify how closely each environment is associated with a predicted functional annotation.
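
The abstract does not say how hits from the three tools are combined, so the following is only an illustrative sketch of annotation transfer from multi-tool hits, not the authors' method. The `Hit` record, the `transfer_annotation` function, the E-value cutoff and the example hits are all hypothetical. It keeps hits below the cutoff and ranks each candidate annotation by how many distinct tools support it.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One database hit reported by a tool (hypothetical record layout)."""
    tool: str        # e.g. "S-BLEST", "PSI-BLAST" or "HMMer"
    annotation: str  # e.g. a SCOP superfamily ID carried by the hit
    e_value: float   # significance reported by the tool


def transfer_annotation(hits, max_e_value=1e-3):
    """Rank candidate annotations by the number of tools that support them.

    The cutoff is an assumed value chosen for illustration only.
    """
    support = defaultdict(set)
    for hit in hits:
        if hit.e_value <= max_e_value:
            support[hit.annotation].add(hit.tool)
    # Most tools first; ties are broken alphabetically by annotation.
    return sorted(
        ((annotation, sorted(tools)) for annotation, tools in support.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


# Made-up example hits for a single query structure.
hits = [
    Hit("S-BLEST", "c.37.1", 1e-6),
    Hit("PSI-BLAST", "c.37.1", 1e-20),
    Hit("HMMer", "c.1.8", 1e-4),
    Hit("PSI-BLAST", "c.2.1", 0.5),  # above the cutoff, so it is ignored
]
ranked = transfer_annotation(hits)
print(ranked)
```

Counting distinct tools rather than raw hits keeps one tool that returns many near-identical neighbors from dominating the ranking. A real pipeline would more likely weight each tool by its measured reliability.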