FEBS Letters

Residue-level prediction of DNA-binding sites and its application on DNA-binding protein predictions.


Abstract

Protein-DNA interactions are crucial to many cellular activities, such as expression control and DNA repair. These interactions between amino acids and nucleotides are highly specific, and any aberration at the binding site can abolish the interaction entirely. In this study, we pursue three aims focused on DNA-binding residues on the protein surface: to develop an automated approach for fast and reliable recognition of DNA-binding sites; to improve the prediction by distance-dependent refinement; and to use these predictions to identify DNA-binding proteins. We use a support vector machine (SVM)-based approach that exploits the features of DNA-binding residues to distinguish them from non-binding residues. The features include the residue's identity, charge, solvent accessibility, average potential, the secondary structure in which it is embedded, neighboring residues, and location in a cationic patch. These features, collected from 50 proteins, are used to train the SVM. Testing is then performed on a separate set of 37 proteins, much larger than any testing set used in previous studies. The testing set has no more than 20% sequence identity, both between any pair of its members and with the proteins in the training set, thus removing undesired redundancy due to homology. It also contains proteins from a DNA-binding structural class that is absent from the training set. With these features, an accuracy of 66% with balanced sensitivity and specificity is achieved without relying on homology or evolutionary information. We then develop a post-processing scheme that improves the prediction using the relative locations of the predicted residues. After this step, balanced performance is achieved, with average sensitivity, specificity and accuracy of 71.3%, 69.3% and 70.5%, respectively. The average net prediction is also around 70%.
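The residue-level measures quoted above can be illustrated with a minimal sketch that computes them from confusion counts. The abstract does not define "net prediction"; the sketch assumes the common convention of averaging sensitivity and specificity, which is consistent with the reported figures but is an assumption here. The function name `binding_site_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def binding_site_metrics(tp, fp, tn, fn):
    """Residue-level metrics from confusion counts.

    tp: binding residues predicted as binding
    fn: binding residues predicted as non-binding
    tn: non-binding residues predicted as non-binding
    fp: non-binding residues predicted as binding
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Assumption: net prediction is the mean of sensitivity and specificity.
    net_prediction = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "net_prediction": net_prediction,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in binding_site_metrics(tp=70, fp=40, tn=60, fn=30).items():
        print(f"{name}: {value:.3f}")
```

Because binding residues are usually a minority of the surface, reporting sensitivity and specificity alongside accuracy shows whether a predictor is balanced or simply favors the majority class.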
Finally, we show that the number of predicted DNA-binding residues can be used to differentiate DNA-binding proteins from non-DNA-binding proteins with an accuracy of 78%. The results presented here demonstrate that machine learning can be applied to the automated identification of DNA-binding residues and that the success rate can be improved as more features are added. Such functional-site prediction protocols can be useful in guiding subsequent work such as site-directed mutagenesis and macromolecular docking.
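The protein-level step, which uses the number of predicted DNA-binding residues to separate DNA-binding from non-DNA-binding proteins, might be sketched as a simple count threshold. The abstract does not state the cutoff or the exact decision rule, so the function `is_dna_binding_protein`, its parameter `min_binding_residues`, and the default value of 10 are all hypothetical placeholders.

```python
def is_dna_binding_protein(residue_predictions, min_binding_residues=10):
    """Label a protein as DNA-binding from per-residue predictions.

    residue_predictions: iterable of booleans, one per surface residue,
        True where the residue-level classifier predicts "binding".
    min_binding_residues: hypothetical cutoff, not a value from the study.
    """
    predicted_binding = sum(1 for is_binding in residue_predictions if is_binding)
    return predicted_binding >= min_binding_residues


if __name__ == "__main__":
    # Made-up per-residue predictions for two illustrative proteins.
    many_sites = [True] * 12 + [False] * 50
    few_sites = [True] * 3 + [False] * 50
    print(is_dna_binding_protein(many_sites))
    print(is_dna_binding_protein(few_sites))
```

In practice such a cutoff would be tuned on held-out proteins, and a fraction of surface residues may be preferable to a raw count when protein sizes vary widely.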