Journal: Current Bioinformatics

MetalExplorer, a Bioinformatics Tool for the Improved Prediction of Eight Types of Metal-Binding Sites Using a Random Forest Algorithm with Two-Step Feature Selection

Abstract

Background: Metalloproteins are highly involved in many biological processes, including catalysis, recognition, transport, transcription, and signal transduction. The metal ions they bind usually play enzymatic or structural roles in mediating these diverse functional roles. Thus, the systematic analysis and prediction of metal-binding sites using sequence and/or structural information are crucial for understanding their sequence-structure-function relationships.

Objective: The objective of this work is to develop a new computational algorithm for improved prediction of major types of metal-binding sites.

Method: We propose MetalExplorer (http://metalexplorer.erc.monash.edu.au/), a new machine learning-based method for predicting eight different types of metal-binding sites (Ca, Co, Cu, Fe, Ni, Mg, Mn, and Zn) in proteins. Our approach combines heterogeneous sequence-, structure-, and residue contact network-based features in a random forest machine-learning framework.

Results: The predictive performance of MetalExplorer was tested by cross-validation and independent tests using non-redundant datasets of known structures. This method applies a two-step feature selection approach based on the maximum relevance minimum redundancy and forward feature selection to identify the most informative features that contribute to the prediction performance. With a precision of 60%, MetalExplorer achieved high recall values, which ranged from 59% to 88% for the eight metal ion types in fivefold cross-validation tests. Moreover, the common and type-specific features in the optimal subsets of all metal ions were characterized in terms of their contributions to the overall performance.

Conclusion: In terms of both benchmark and independent datasets at the 60% precision control level, MetalExplorer compared favorably with an existing metalloprotein prediction tool, SitePredict. MetalExplorer is expected to be a powerful tool for the accurate prediction of potential metal-binding sites and it should facilitate the functional analysis and rational design of novel metalloproteins.
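
The abstract reports recall at a controlled precision level of 60%. As an illustration only, and not the authors' code, the following minimal Python sketch shows one common way such a figure can be computed from per-residue classifier scores: rank residues by score, sweep the cutoff, and keep the highest recall among cutoffs whose precision is at least the target. The function name `recall_at_precision`, the 0/1 label encoding, and the toy data are assumptions made for this example.

```python
def recall_at_precision(labels, scores, min_precision=0.60):
    """Return the highest recall reachable at any score cutoff whose
    precision is at least `min_precision`.

    labels: iterable of 0/1 (1 = metal-binding residue; assumed encoding)
    scores: iterable of classifier scores, higher = more likely binding
    """
    labels = list(labels)
    scores = list(scores)
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank residues from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall = 0.0
    true_pos = 0
    false_pos = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        # Residues with tied scores are all predicted together at this cutoff.
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        precision = true_pos / (true_pos + false_pos)
        if precision >= min_precision:
            best_recall = max(best_recall, true_pos / total_positives)
    return best_recall


# Toy data, invented for illustration only.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(recall_at_precision(toy_labels, toy_scores))  # 0.75
```

Fixing precision and comparing recall puts two predictors on the same footing, since each is evaluated at a cutoff where the same share of its predicted sites is correct. How the original study chose its cutoffs is not stated in the abstract.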
