Journal of Computational Chemistry: Organic, Inorganic, Physical, Biological

SPINE X: Improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles



Abstract

Accurate prediction of protein secondary structure is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. The accuracy of ab initio secondary structure prediction from sequence, however, has only increased from around 77% to 80% over the past decade. Here, we developed a multistep neural-network algorithm by iteratively coupling secondary structure prediction with prediction of solvent accessibility and backbone torsion angles. Our method, called SPINE X, was applied to a dataset of 2640 proteins (25% sequence identity cutoff) previously built for the first version of SPINE and achieved an accuracy (Q_3) of 82.0% based on 10-fold cross validation. That SPINE X surpasses 81% accuracy is further confirmed on an independently built test dataset of 1833 protein chains, a recently built dataset of 1975 proteins, and 117 CASP 9 targets (Critical Assessment of Structure Prediction techniques), with accuracies of 81.3%, 82.3%, and 81.8%, respectively. The prediction accuracy is further improved to 83.8% for the dataset of 2640 proteins if the DSSP assignment used above is replaced by a more consistent consensus secondary structure assignment method. SPINE X is compared with the popular PSIPRED and with CASP-winning structure-prediction techniques. SPINE X predicts the number of helices and sheets correctly for 21.0% of the 1833 proteins, compared to 17.6% for PSIPRED. The comparison further shows that SPINE X consistently makes more accurate predictions for helical residues (6%) without overprediction, whereas PSIPRED makes more accurate predictions for coil residues (3-5%) but overpredicts them by 7%. The SPINE X server and its training/test datasets are available at http://sparks.informatics.iupui.edu/
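The accuracy figures quoted above are Q_3 values, that is, the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The following is a minimal Python sketch of that metric and of a simple segment count, not the authors' evaluation code. The function names and the toy strings are hypothetical, and counting each contiguous run of E as one strand segment is an assumption, since the abstract does not state how helices and sheets were counted.

```python
from itertools import groupby


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def count_segments(states: str, label: str) -> int:
    """Number of contiguous runs of `label` (assumed segment definition)."""
    return sum(1 for key, _ in groupby(states) if key == label)


# Toy example (made-up strings, not data from the paper).
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEEECC"
print(f"Q3 = {q3_accuracy(predicted, observed):.1f}%")
print("helix segments:", count_segments(predicted, "H"))
print("strand segments:", count_segments(predicted, "E"))
```

In a 10-fold cross validation, a metric like this would be computed on each held-out fold and then aggregated over the folds; the aggregation rule is not given in the abstract.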
