Journal of molecular graphics & modelling

QSAR modeling of peptide biological activity by coupling support vector machine with particle swarm optimization algorithm and genetic algorithm


Abstract

A novel method coupling particle swarm optimization (PSO) with a genetic algorithm (GA) was proposed to simultaneously optimize the kernel parameters of a support vector machine (SVM) and select an optimized feature subset. In the coupled scheme, the particles produced in each PSO generation were processed by the crossover and mutation operators of the GA, so that the particles could maintain diversity, escape local optima, and find the global optimum quickly and accurately. To evaluate the proposed method, four peptide datasets were used for quantitative structure-activity relationship (QSAR) modeling, with each peptide represented by structural and physicochemical features derived from its amino acid sequence. For the four datasets, the correlation coefficients (R) were 1.0000, 0.9508, 1.0000, and 0.9995 on the training sets and 0.9922, 0.9687, 0.9022, and 0.7404 on the test sets, respectively. The corresponding root-mean-square errors (RMSEs) were 0.0000, 0.0986, 0.0000, and 0.0203 on the training sets and 0.2522, 0.2782, 0.9625, and 0.2928 on the test sets, respectively. A dataset of 277 proteins was also used to evaluate the method for predicting protein structural class, and good overall success rates were obtained. The results indicate that the proposed method may become a useful tool in peptide QSAR and protein prediction research.
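
The coupling described in the abstract (a PSO update followed by GA crossover and mutation on the same particles in every generation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the operator choices (arithmetic crossover, uniform resampling mutation), and all parameter values are assumptions made for illustration. The toy objective `toy_error` only stands in for the cross-validated SVM error, the two coordinates are loosely imagined as log-scaled kernel parameters, and feature-subset selection is left out.

```python
import random

def hybrid_pso_ga(fitness, bounds, n_particles=20, n_iter=60,
                  w=0.7, c1=1.5, c2=1.5, p_cross=0.5, p_mut=0.1, seed=0):
    """Minimise `fitness` inside the box `bounds` (illustrative sketch).

    Each generation: PSO velocity/position update, then GA-style
    crossover and mutation applied to the particle positions.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]

    for _ in range(n_iter):
        # PSO step: pull each particle toward its own and the global best.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i] = clip([x + v for x, v in zip(pos[i], vel[i])])

        # GA crossover (assumed form): arithmetic blend of random pairs.
        order = list(range(n_particles))
        rng.shuffle(order)
        for a, b in zip(order[::2], order[1::2]):
            if rng.random() < p_cross:
                alpha = rng.random()
                pa, pb = pos[a], pos[b]
                pos[a] = [alpha * x + (1 - alpha) * y for x, y in zip(pa, pb)]
                pos[b] = [alpha * y + (1 - alpha) * x for x, y in zip(pa, pb)]

        # GA mutation (assumed form): resample a coordinate uniformly.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < p_mut:
                    pos[i][d] = rng.uniform(lo, hi)

        # Re-evaluate and update personal and global bests.
        for i in range(n_particles):
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

# Hypothetical stand-in for a cross-validated SVM error surface.
def toy_error(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 5.0) ** 2

BOUNDS = [(-5.0, 15.0), (-15.0, 3.0)]  # assumed search ranges
best, best_f = hybrid_pso_ga(toy_error, BOUNDS, seed=0)
print(best, best_f)
```

In a real QSAR setting, `toy_error` would be replaced by a function that trains an SVM with the candidate parameters and returns a validation error. Because the personal and global bests are stored separately from the positions, the GA operators can disturb the swarm without losing the best solution found so far.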
