Conference: IEEE International Conference on Systems, Man, and Cybernetics (SMC)

The use of support vector machine and genetic algorithms to predict protein function

Abstract

In bioinformatics, predicting protein function is considered an important but difficult task. Using a set of enzymes from the Hydrolase, Isomerase, Ligase, Lyase, Transferase and Oxidoreductase classes, previously used by Dobson et al., this paper proposes a self-learning process that predicts an enzyme's class from its primary and secondary structures using a Support Vector Machine (SVM) classifier and a genetic algorithm. An SVM is a supervised machine learning algorithm capable of solving linear and non-linear classification problems. During learning, both the training data and the corresponding outputs are presented to the SVM so that its parameters can be adjusted. This study uses genetic algorithms, optimization heuristics often used to estimate parameters, to tune the main parameters of the classifier, such as the kernel function type and the parameter C, which sets the trade-off between training error and the margin of separation between classes. For this prediction problem, the results indicate that the best kernel function is an RBF with width 6.1 and C = 6.9. With these parameters, the classifier obtains an average accuracy of 79.74%.
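The parameter search described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over (width, C) pairs. Everything here is an assumption made for illustration, not the paper's method: the operators (tournament selection, blend crossover, Gaussian mutation, elitism), the population size, the search bounds, and the function names are all hypothetical. The sketch also leaves out the choice of kernel type, which the paper tunes as well. `surrogate_fitness` is a synthetic stand-in; in a real setup it would be replaced by the cross-validated accuracy of an SVM trained with the given width and C.

```python
import math
import random


def surrogate_fitness(width, c):
    """Hypothetical stand-in for cross-validated SVM accuracy.

    Peaks at an arbitrary point (width=2.0, c=5.0); it is NOT derived
    from the paper's data or results.
    """
    return math.exp(-((width - 2.0) ** 2 + (c - 5.0) ** 2) / 10.0)


def genetic_search(fitness, bounds, pop_size=30, generations=40,
                   mutation_rate=0.2, seed=0):
    """Maximize fitness(width, c) with a simple real-valued GA."""
    rng = random.Random(seed)

    def tournament(pop):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(*a) >= fitness(*b) else b

    # Each individual is a (width, C) pair drawn uniformly within bounds.
    population = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
                  for _ in range(pop_size)]
    best = max(population, key=lambda ind: fitness(*ind))

    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            # Blend crossover: a random convex combination of the parents.
            alpha = rng.random()
            child = [alpha * g1 + (1 - alpha) * g2 for g1, g2 in zip(p1, p2)]
            for i, (lo, hi) in enumerate(bounds):
                # Gaussian mutation scaled to the parameter range.
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
                # Clip back into the allowed range.
                child[i] = min(hi, max(lo, child[i]))
            children.append(tuple(child))
        population = children
        best = max(population, key=lambda ind: fitness(*ind))
    return best


# Illustrative bounds for (width, C); the paper does not state its search range.
BOUNDS = [(0.1, 10.0), (0.1, 10.0)]
best_width, best_c = genetic_search(surrogate_fitness, BOUNDS)
print(f"best width={best_width:.2f}, best C={best_c:.2f}")
```

Because the fitness function is passed in as an argument, swapping the surrogate for a real evaluation (train an SVM, return its mean validation accuracy) would not require changes to the search loop itself.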