International Journal of Quantum Chemistry

Prediction of Secondary Protein Structure with Binary Coding Patterns of Amino Acid and Nucleotide Physicochemical Properties

Abstract

We present a binary coding algorithm for alpha- and beta-protein fold prediction. The method links amino acid molecular polarity patterns and the physicochemical properties of nucleotide bases, coded by means of binary addresses. Primary sequences that define secondary protein structure were analyzed with respect to the symbolic oligopeptides (SO) obtained by reducing the 20-letter amino acid alphabet to a binary alphabet of nonpolar group 0 (W, C, I, F, M, V, L, Y) and polar group 1 (Q, R, H, K, N, E, D, S, G, T, A, P). The groups were extracted from the Grantham polarity scale with the clustering-around-medoids procedure. Transforming protein strings into binary coding patterns of the polar and nonpolar amino acid groups reduced the number of analyzed elements within a protein motif of length n by a factor of 10^n. The SMO learning algorithm for support vector machines was applied to classify alpha-helices and beta-strands. It was shown that the relative frequencies of binary hexapeptides classify all 174 nonhomologous alpha- and beta-protein folds from the Jpred database with 100% accuracy. The results of 10-fold cross-validation and the leave-one-out test were 86.78%. A classification tree confirmed the results of the SMO analysis and correctly classified 100% of the folds by means of 9 binary hexapeptides. A linear block triple-check code was proposed for the description of hexapeptide patterns. The presented method enables simple, quick, and accurate prediction of alpha- and beta-protein folding types from the primary amino acid and nucleotide sequences on a personal computer. Our results imply that a few amino acid polarity patterns specified by the nucleotide physicochemical properties describe basic protein folding types with >90% accuracy.
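The encoding step described in the abstract can be illustrated with a minimal Python sketch. It maps each residue to 0 (nonpolar) or 1 (polar) using the two groups listed above, then computes the relative frequency of every binary hexapeptide. The function names and the use of overlapping sliding windows are assumptions made for illustration; the abstract does not state how the hexapeptides were counted, and this is not the authors' implementation.

```python
from collections import Counter
from itertools import product

# Groups as listed in the abstract (derived there from the Grantham polarity scale).
NONPOLAR = set("WCIFMVLY")      # binary symbol 0
POLAR = set("QRHKNEDSGTAP")     # binary symbol 1


def to_binary(sequence):
    """Map a one-letter amino acid sequence to a string of 0s and 1s."""
    symbols = []
    for residue in sequence.upper():
        if residue in NONPOLAR:
            symbols.append("0")
        elif residue in POLAR:
            symbols.append("1")
        else:
            raise ValueError(f"unknown residue: {residue!r}")
    return "".join(symbols)


def hexapeptide_frequencies(binary, k=6):
    """Relative frequency of each of the 2**k binary k-mers.

    Assumption: overlapping sliding windows of length k; the abstract
    does not specify the windowing scheme.
    """
    windows = [binary[i:i + k] for i in range(len(binary) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    patterns = ["".join(bits) for bits in product("01", repeat=k)]
    return {p: (counts[p] / total if total else 0.0) for p in patterns}


if __name__ == "__main__":
    example = "MKVLAAGIVW"  # hypothetical sequence, for illustration only
    encoded = to_binary(example)
    features = hexapeptide_frequencies(encoded)
    print(encoded)
    print({p: f for p, f in features.items() if f > 0})
    # The reduction factor for motifs of length n is 20**n / 2**n = 10**n.
    print(20**6 // 2**6)
```

The resulting 64-element frequency vector is the kind of feature set that could then be passed to a classifier such as a support vector machine or a classification tree, as the abstract describes.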
