Journal: Bioinformatics

Computational prediction of N-linked glycosylation incorporating structural properties and patterns

Abstract

Motivation: N-linked glycosylation occurs predominantly at the N-X-T/S motif, where X is any amino acid except proline. Not all N-X-T/S sequons are glycosylated, and a number of web servers have been developed that predict N-linked glycan occupancy from sequence and/or residue pattern information. None of the currently available servers, however, uses protein structural information to predict N-glycan occupancy.

Results: Here, we describe a novel classifier algorithm, NGlycPred, for predicting glycan occupancy at N-X-T/S sequons. The algorithm uses both structural and residue pattern information and was trained on a set of glycosylated protein structures using the Random Forest algorithm. The best predictor achieved a balanced accuracy of 0.687 under 10-fold cross-validation on a curated dataset of 479 N-X-T/S sequons and outperformed sequence-based predictors when evaluated on the same dataset. The incorporation of structural information, including local contact order, surface accessibility/composition and secondary structure, thus improves the prediction accuracy of glycan occupancy at the N-X-T/S consensus sequon.
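As an illustration only (this is not NGlycPred code), the sequon definition and the reported metric can be sketched in a few lines of Python. The function names `find_sequons` and `balanced_accuracy` are hypothetical, and the metric is assumed to follow the common definition of balanced accuracy as the mean of sensitivity and specificity; the abstract does not spell out the formula.

```python
import re

# N-X-T/S sequon: Asn, then any residue except Pro, then Thr or Ser.
# A lookahead is used so that overlapping sequons (e.g. "NNTS") are all found.
# Note: [^P] also accepts non-standard symbols such as "X" or "-".
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 0-based positions of the Asn in every N-X-T/S sequon."""
    return [m.start() for m in SEQUON.finditer(sequence.upper())]


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn)  # fraction of occupied sequons recovered
    specificity = tn / (tn + fp)  # fraction of unoccupied sequons recovered
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Toy sequence, not taken from the paper's dataset.
    print(find_sequons("MNGTAANPSNNTS"))
    # Toy confusion-matrix counts, not results from the paper.
    print(balanced_accuracy(tp=8, tn=6, fp=4, fn=2))
```

A scan like this only enumerates candidate sites. Deciding which of them are actually occupied is the classification problem the abstract describes, and balanced accuracy is a natural metric when occupied and unoccupied sequons are unevenly represented.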