Journal of Translational Medicine

Predicting phosphorylation sites using machine learning by integrating the sequence, structure, and functional information of proteins


Abstract

Post-translational modification (PTM) is a biological process that alters proteins and is therefore involved in the regulation of various cellular activities and in pathogenesis. Protein phosphorylation is an essential process and one of the most-studied PTMs: it occurs when a phosphate group is added to a serine (Ser, S), threonine (Thr, T), or tyrosine (Tyr, Y) residue. Dysregulation of protein phosphorylation can lead to various diseases (most commonly neurological disorders, Alzheimer’s disease, and Parkinson’s disease), which makes it necessary to predict which S/T/Y residues in an uncharacterized amino acid sequence can be phosphorylated. Despite a surplus of sequencing data, current experimental methods for identifying PTMs are time-consuming, costly, and error-prone, so a number of computational methods have been proposed to replace them. However, phosphorylation prediction remains limited, owing to substrate specificity, performance, and the diversity of its features. In the present study, we propose machine-learning-based predictors that use the physicochemical, sequence, structural, and functional information of proteins to classify S/T/Y phosphorylation sites. Rigorous feature selection, the minimum redundancy/maximum relevance approach, and the symmetrical uncertainty method were employed to extract the most informative features for training the models. The random forest (RF) and support vector machine (SVM) models generated using diverse feature types in the present study were highly accurate, as is evident from good values for different statistical measures. Moreover, independent test sets and benchmark validations indicated that the proposed method clearly outperformed the existing methods, demonstrating its ability to accurately predict protein phosphorylation. The results obtained in the present work indicate that the proposed computational methodology can be effectively used for predicting putative phosphorylation sites, further facilitating the discovery of the mechanisms of various biological processes.
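The abstract names symmetrical uncertainty as one of the feature-selection criteria but does not say how it was computed. As a rough illustration only, the sketch below uses the standard textbook definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), for discrete variables. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Toy data (hypothetical): 1 = phosphorylated site, 0 = non-phosphorylated site.
labels = [1, 1, 0, 0]
feature_a = ["S", "S", "T", "T"]  # tracks the label exactly
feature_b = ["S", "T", "S", "T"]  # carries no information about the label

su_a = symmetrical_uncertainty(feature_a, labels)
su_b = symmetrical_uncertainty(feature_b, labels)
print(su_a, su_b)
```

A feature with a higher score shares more information with the class label, so ranking features by this score is one plausible way to keep the most informative ones. Continuous features would need to be discretized first, which this sketch does not cover.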
