Clustering support vector machines for protein local structure prediction

Wei Zhong; Jieyue He; Robert Harrison; Phang C. Tai; Yi Pan


Abstract

Understanding the sequence-to-structure relationship is a central task in bioinformatics research. Adequate knowledge of this relationship can potentially improve the accuracy of local protein structure prediction. One approach to protein local structure prediction uses conventional clustering algorithms to capture the sequence-to-structure relationship. However, the cluster membership function defined by conventional clustering algorithms may not adequately reveal the complex nonlinear relationship. In contrast, a Support Vector Machine (SVM) can capture the nonlinear sequence-to-structure relationship by mapping the input space into a higher-dimensional feature space. SVMs, however, scale poorly to huge datasets containing millions of samples. We therefore propose a novel computational model called Clustering Support Vector Machines (CSVMs). Drawing on both granular computing theory and advanced statistical learning methodology, a CSVM is built specifically for each information granule into which the clustering algorithm partitions the data. This makes the learning task for each CSVM more specific and simpler. Because the CSVMs for different granules can be trained independently, they are easily parallelized and can handle complex classification problems on huge datasets. The average accuracy of the CSVMs is over 80%, which indicates that their generalization power is strong enough to recognize the complicated patterns of the sequence-to-structure relationship. Our experimental results show that, compared with the conventional clustering algorithm, CSVMs noticeably improve the accuracy of local structure prediction.
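
The cluster-then-classify idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: plain k-means stands in for the clustering algorithm, a linear SVM trained by batch subgradient descent stands in for the kernel SVM, the data are toy 2-D points rather than protein sequence segments, and the names `kmeans`, `train_linear_svm`, `fit_csvm` and `predict_csvm` are hypothetical.

```python
"""Toy sketch: partition the data into granules, then train one SVM per granule."""

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def kmeans(points, k, iters=10):
    """Plain k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss (linear SVM)."""
    n, dim = len(X), len(X[0])
    w = [0.0] * (dim + 1)  # last weight is the bias
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            xb = list(xi) + [1.0]
            if yi * sum(wj * xj for wj, xj in zip(w, xb)) < 1:  # margin violated
                for j, xj in enumerate(xb):
                    grad[j] -= yi * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def fit_csvm(points, labels, k):
    """Cluster the data, then fit one local SVM per cluster (granule).

    Assumes every cluster ends up non-empty. Each local model is trained
    independently of the others, so this loop could be run in parallel.
    """
    centers = kmeans(points, k)
    svms = []
    for j, c in enumerate(centers):
        idx = [i for i, p in enumerate(points) if nearest(p, centers) == j]
        # Express each sample relative to its cluster center.
        X = [tuple(v - cv for v, cv in zip(points[i], c)) for i in idx]
        y = [labels[i] for i in idx]
        svms.append(train_linear_svm(X, y))
    return centers, svms

def predict_csvm(model, p):
    """Route p to its nearest cluster and apply that cluster's SVM."""
    centers, svms = model
    j = nearest(p, centers)
    x = [v - cv for v, cv in zip(p, centers[j])] + [1.0]
    score = sum(wj * xj for wj, xj in zip(svms[j], x))
    return 1 if score >= 0 else -1

# Toy data: two well-separated granules that follow different labelling rules.
offsets = [-2, -1, 1, 2]
granule_a = [(x, y) for x in offsets for y in offsets]            # label depends on x
granule_b = [(10 + x, 10 + y) for x in offsets for y in offsets]  # label depends on y
points = granule_a + granule_b
labels = ([1 if x > 0 else -1 for x, y in granule_a]
          + [1 if y > 10 else -1 for x, y in granule_b])

model = fit_csvm(points, labels, k=2)
accuracy = sum(predict_csvm(model, p) == t for p, t in zip(points, labels)) / len(points)
print(f"training accuracy: {accuracy:.2f}")
```

In this toy setting each granule has its own simple decision rule, so each local classifier faces an easier problem than one global classifier would, which is the intuition the abstract gives for building a separate SVM per granule.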

Bibliographic record

Source
Expert Systems with Applications, 2007, No. 2, pp. 518-526 (9 pages)
Authors
Wei Zhong; Jieyue He; Robert Harrison; Phang C. Tai; Yi Pan
Affiliation

Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Theory of artificial intelligence
Keywords
clustering algorithm; SVM (support vector machine); protein structure prediction; granular computing

Similar literature

1. Multi-level clustering support vector machine trees for improved protein local structure prediction [J]. Wei Zhong, Jieyue He, Xiujuan Chen. International Journal of Data Mining and Bioinformatics. 2014, No. 2
2. StruLocPred: structure-based protein subcellular localisation prediction using multi-class support vector machine [J]. Wengang Zhou, Julie A. Dickerson. International Journal of Data Mining and Bioinformatics. 2012, No. 2
3. Protein local 3D structure prediction by Super Granule Support Vector Machines (Super GSVM) [J]. Bernard Chen, Matthew Johnson. BMC Bioinformatics. 2009, Supplement 11
4. Multiclass Fuzzy Clustering Support Vector Machines for Protein Local Structure Prediction [C]. Wei Zhong, Jieyue He, Yi Pan. IEEE International Symposium on Bioinformatics and Bioengineering. 2007
5. Clustering system and clustering support vector machine for local protein structure prediction [D]. Zhong, Wei. 2006
6. Protein local 3D structure prediction by Super Granule Support Vector Machines (Super GSVM) [O]. Bernard Chen, Matthew Johnson. 2009
7. Clustering Support Vector Machines and Its Application to Local Protein Tertiary Structure Prediction [O]. Jieyue He, Wei Zhong, Robert Harrison. 2006
