Conference: IEEE International Conference on Bioinformatics and Biomedicine

Protein Ubiquitylation and Sumoylation Site Prediction Based on Ensemble and Transfer Learning

Abstract

Ubiquitylation, a typical post-translational modification (PTM), plays an important role in signal transduction, apoptosis and cell proliferation. Sumoylation, a ubiquitylation-like PTM, may also affect gene mapping, expression and genomic replication. Over the past two decades, machine learning has been widely employed in protein ubiquitylation and sumoylation site prediction tools. These existing tools require feature engineering, yet they fail to provide general interpretable features and probably underutilize the growing amount of data. This prompted us to propose a deep learning-based model that integrates multiple convolutional and fully-connected layers from seven supervised learning sub-models to extract deep representations from protein sequences and physico-chemical properties (PCPs). In particular, we divided the PCPs into 6 clusters and customized the deep networks accordingly to handle the high correlations within each cluster. A stacking ensemble strategy was applied to combine these deep representations into a final prediction. Furthermore, thanks to transfer learning, our deep learning model also works well on protein sumoylation site prediction after fine-tuning. On the high-quality annotated database Swiss-Prot, our model outperformed several well-known ubiquitylation and sumoylation site prediction tools. Our code is freely available at https://github.com/ruiwcoding/DeepUbiSumoPre.
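
The stacking idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper stacks deep representations from seven sub-models, whereas the sketch below stacks scalar scores from three hypothetical sub-models for brevity, and it uses a hand-written logistic-regression meta-learner as an assumed stand-in. The function names (`fit_meta_learner`, `predict`) and all numbers are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, scores):
    """Meta-learner output: estimated probability that a site is modified."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, scores)))


def fit_meta_learner(sub_scores, labels, lr=0.5, epochs=5000):
    """Fit logistic-regression weights over sub-model scores
    using plain batch gradient descent on the log loss."""
    n_models = len(sub_scores[0])
    n = len(labels)
    weights, bias = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for scores, y in zip(sub_scores, labels):
            err = predict(weights, bias, scores) - y
            for j, s in enumerate(scores):
                grad_w[j] += err * s
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Made-up scores from three hypothetical sub-models on held-out sites
# (one row per candidate site, one column per sub-model).
held_out_scores = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.1],
    [0.1, 0.4, 0.3],
    [0.3, 0.2, 0.2],
]
held_out_labels = [1, 1, 1, 0, 0, 0]  # 1 = modified site, 0 = unmodified

# Level-1 model: learns how much to trust each sub-model.
weights, bias = fit_meta_learner(held_out_scores, held_out_labels)

# Combine the sub-model scores for a new candidate site.
new_site_scores = [0.85, 0.75, 0.80]
print(round(predict(weights, bias, new_site_scores), 3))
```

The meta-learner is fitted on scores the sub-models produced for sites they were not trained on, which is the usual way stacking avoids rewarding a sub-model for memorizing its own training data.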
