Journal: Journal of Computer-Aided Molecular Design

eFindSite: Improved prediction of ligand binding sites in protein models using meta-threading, machine learning and auxiliary ligands


Abstract

The molecular structures and functions of the majority of proteins across different species are yet to be identified. Much-needed functional annotation of these gene products often benefits from knowledge of protein-ligand interactions. Towards this goal, we developed eFindSite, an improved version of FINDSITE, designed to more efficiently identify ligand binding sites and residues using only weakly homologous templates. It employs a collection of effective algorithms, including highly sensitive meta-threading approaches, improved clustering techniques, advanced machine learning methods and reliable confidence estimation systems. Depending on the quality of target protein structures, eFindSite outperforms geometric pocket detection algorithms by 15-40% in binding site detection and by 5-35% in binding residue prediction. Moreover, compared to FINDSITE, it identifies 14% more binding residues in the most difficult cases. When multiple putative binding pockets are identified, the ranking accuracy is 75-78%, which can be further improved by 3-4% by including auxiliary information on binding ligands extracted from biomedical literature. As a first genome-wide application, we describe structure modeling and binding site prediction for the entire proteome of Escherichia coli. Carefully calibrated confidence estimates strongly indicate that highly reliable ligand binding predictions are made for the majority of gene products; thus, eFindSite holds significant promise for large-scale genome annotation and drug development projects. eFindSite is freely available to the academic community at http://www.brylinski.org/efindsite.
