Conference: IEEE International Symposium on Bioinformatics and Bioengineering

Multiclass Fuzzy Clustering Support Vector Machines for Protein Local Structure Prediction


Abstract

Local protein structure prediction is a central task in bioinformatics research, and for huge datasets it can be cast as a multiclass problem. In a previous study, multiclass Clustering Support Vector Machines (CSVMs) were proposed for local protein structure prediction. In that approach, a greedy algorithm selects the next closest class if the CSVM modeled for the assigned class predicts the sequence segment as negative. However, the greedy algorithm may not be optimal, and if all CSVMs predict the sequence segment as negative, the segment cannot be classified. To further improve performance on the multiclass problem, we propose Fuzzy Clustering Support Vector Machines (FCSVMs) in this study. The FCSVMs model calculates the class membership value of a given sequence segment for each class and assigns the representative structure of the finally selected class to the segment. Values of the fuzzy membership function are based on the testing accuracy of decision function outputs from the FCSVMs, so values of different fuzzy membership functions can be compared. Each FCSVM is built specifically for one class, with the classes partitioned intelligently by the clustering algorithm, which makes the learning task for each FCSVM more specific and simpler. Furthermore, the FCSVMs modeled for each class can be easily parallelized to handle complex multiclass problems on huge datasets. Using fuzzy membership functions, all sequence segments can be classified. Compared with the conventional clustering algorithm and CSVMs, testing accuracy for local structure prediction improves noticeably when the FCSVMs model is applied.
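The contrast between the greedy CSVM rule and the fuzzy membership rule can be illustrated with a minimal Python sketch. The abstract does not give the exact membership function, so this sketch assumes one plausible reading: each class-specific SVM has a lookup table that maps ranges of its decision value to the accuracy observed for that range on held-out data. The function names, the bin edges, and all numbers below are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def greedy_csvm_assign(decision_values, classes_by_distance):
    """Greedy rule: walk the classes from closest to farthest and accept
    the first one whose class-specific SVM votes positive."""
    for c in classes_by_distance:
        if decision_values[c] > 0:
            return c
    return None  # every SVM voted negative: the segment stays unclassified


def membership(decision_value, bin_edges, bin_accuracy):
    """Assumed membership function: look up the accuracy recorded for the
    decision-value range that this output falls into."""
    return bin_accuracy[bisect_right(bin_edges, decision_value)]


def fuzzy_assign(decision_values, calibration):
    """Score every class on the shared accuracy scale and pick the highest."""
    scores = {
        c: membership(v, *calibration[c]) for c, v in decision_values.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical decision values for one sequence segment (all negative).
decision_values = {0: -0.4, 1: -0.1, 2: -1.3}

# Hypothetical per-class tables: (bin edges, accuracy per bin).
# Three edges define four bins, so each accuracy list has four entries.
edges = [-1.0, 0.0, 1.0]
calibration = {
    0: (edges, [0.05, 0.30, 0.70, 0.95]),
    1: (edges, [0.10, 0.45, 0.75, 0.90]),
    2: (edges, [0.02, 0.20, 0.60, 0.90]),
}

greedy_choice = greedy_csvm_assign(decision_values, [0, 1, 2])
fuzzy_choice, fuzzy_scores = fuzzy_assign(decision_values, calibration)
```

In this toy case the greedy rule returns no class because every decision value is negative, while the fuzzy rule still selects a class. Because all memberships are expressed as accuracies, scores from different class-specific models sit on one comparable scale, which is the property the abstract attributes to the fuzzy membership values.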
