Journal: Scientific Reports

N-GlyDE: a two-stage N-linked glycosylation site prediction incorporating gapped dipeptides and pattern-based encoding


Abstract

N-linked glycosylation is one of the predominant post-translational modifications and is involved in a number of biological functions. Since experimental characterization of glycosites is challenging, glycosite prediction is crucial. Several predictors have been made available and report high performance. Most of them, however, evaluate performance at every asparagine in a protein sequence rather than only at asparagines within N-X-S/T sequons. In this paper, we present N-GlyDE, a two-stage prediction tool trained on rigorously constructed non-redundant datasets to predict N-linked glycosites in the human proteome. The first stage uses a protein similarity voting algorithm, trained on both glycoproteins and non-glycoproteins, to assign each protein a score that is then used to improve glycosite prediction. The second stage uses a support vector machine to predict N-linked glycosites from three feature types: gapped dipeptides, pattern-based predicted surface accessibility, and predicted secondary structure. N-GlyDE's final predictions are obtained by adjusting the weight of the second-stage prediction results according to the first-stage prediction score. Evaluated on the N-X-S/T sequons of an independent dataset comprising 53 glycoproteins and 33 non-glycoproteins, N-GlyDE achieves an accuracy of 0.740 and an MCC of 0.499, outperforming the compared tools. The N-GlyDE web server is available at http://bioapp.iis.sinica.edu.tw/N-GlyDE/.
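To make the terms in the abstract concrete, the following Python sketch shows one way to locate N-X-S/T sequons, count gapped dipeptides in a window around a candidate asparagine, and compute the Matthews correlation coefficient (MCC) from a confusion matrix. It is an illustration under stated assumptions, not the N-GlyDE implementation: the function names, the window size `flank`, the gap limit `max_gap`, and the feature key format are all hypothetical, and the exclusion of proline at the X position follows the conventional sequon definition rather than anything stated in the abstract. The first-stage similarity voting and the weight adjustment are not sketched, because the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Lookahead pattern so overlapping sequons are all reported.
# Assumption: X may be any residue except proline (conventional definition).
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 0-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(seq.upper())]


def gapped_dipeptides(seq, center, flank=3, max_gap=2):
    """Count residue pairs separated by 0..max_gap residues in a window.

    The window spans `flank` residues on each side of `center` and is
    truncated at the sequence ends. Keys look like "A1D": residue A,
    one residue skipped, then residue D (a hypothetical encoding).
    """
    window = seq[max(0, center - flank): center + flank + 1]
    counts = Counter()
    for gap in range(max_gap + 1):
        for i in range(len(window) - gap - 1):
            counts[f"{window[i]}{gap}{window[i + gap + 1]}"] += 1
    return counts


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    toy = "MNGSANPTNNSS"  # made-up sequence, not from the paper's datasets
    for pos in find_sequons(toy):
        print(pos, dict(gapped_dipeptides(toy, pos)))
```

In a pipeline of this kind, the counts for each sequon would be turned into a fixed-length vector and passed to a classifier such as a support vector machine, and restricting evaluation to the positions returned by `find_sequons` mirrors the abstract's point about scoring only N-X-S/T asparagines.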
