Conference: International Conference on Computational Science and Its Applications (ICCSA 2005), pt. 3; May 9-12, 2005; Singapore (SG)

SVM Classification to Predict Two Stranded Anti-parallel Coiled Coils Based on Protein Sequence Data



Abstract

The coiled coil is an important 3-D protein structural motif in which two or more alpha-helical strands wind around each other to form a "knobs-into-holes" structure. In this paper we propose an SVM classification approach to predict the two-stranded anti-parallel coiled-coil structure from the primary amino acid sequence. The training dataset was collected from the SOCKET database, a database of coiled coils predicted by the SOCKET algorithm. A total of 41 sequences, each containing at least two heptad repeats of the two-stranded anti-parallel coiled-coil motif, were extracted from 12 proteins as the positive dataset. A total of 37 sequences, comprising non-coiled-coil sequences and two-stranded parallel coiled-coil motifs, were extracted from 5 proteins as the negative dataset. The normalized positional weight matrix for the heptad registers a, b, c, d, e, f and g was taken from the SOCKET database and used to generate the positional weight for each entry. We performed SVM classification using the cross-validated datasets as training and testing groups. Our results show 73% accuracy in predicting two-stranded anti-parallel coiled coils on the cross-validated data. The results suggest that SVM is a useful approach for classifying two-stranded anti-parallel coiled coils based on the primary amino acid sequence.
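The abstract does not spell out how the positional weights become classifier inputs, so the following is only a minimal sketch of one plausible encoding: each residue is assigned to a heptad register (a to g), looked up in a per-register weight table, and the weights are averaged per register to give a 7-value feature vector. The function name `heptad_features`, the `TOY_WEIGHTS` table, and all of its numbers are hypothetical placeholders for illustration, not values from the SOCKET database or the authors' actual encoding.

```python
from typing import Dict, List

REGISTERS = "abcdefg"  # the seven heptad positions

# Hypothetical, made-up weights for illustration only.
# These are NOT values from the SOCKET database.
TOY_WEIGHTS: Dict[str, Dict[str, float]] = {
    "a": {"L": 0.30, "I": 0.20, "V": 0.15},
    "d": {"L": 0.40, "A": 0.10},
    "e": {"E": 0.25, "K": 0.20},
    "g": {"E": 0.20, "K": 0.25},
}


def heptad_features(
    sequence: str,
    start_register: str = "a",
    weights: Dict[str, Dict[str, float]] = TOY_WEIGHTS,
    default: float = 0.0,
) -> List[float]:
    """Return 7 values: the mean positional weight at registers a..g.

    Assumes the register of the first residue is known (start_register)
    and that registers simply cycle a..g along the sequence.
    """
    offset = REGISTERS.index(start_register)
    totals = [0.0] * 7
    counts = [0] * 7
    for i, residue in enumerate(sequence.upper()):
        idx = (offset + i) % 7
        register = REGISTERS[idx]
        # Residues missing from the table fall back to `default`.
        totals[idx] += weights.get(register, {}).get(residue, default)
        counts[idx] += 1
    # Registers that never occur in the sequence get 0.0.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Two heptad repeats, first residue at register "a".
features = heptad_features("LAALEEKLAALEEK")
```

Vectors like `features`, labeled as positive (anti-parallel) or negative, could then be passed to any SVM implementation and evaluated with k-fold cross-validation. The kernel, the number of folds, and the exact feature layout are not stated in the abstract and would have to be chosen.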
