Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Fuzzy ARTMAP Prediction of Biological Activities for Potential HIV-1 Protease Inhibitors Using a Small Molecular Data Set


Abstract

Obtaining satisfactory results with neural networks depends on the availability of large data samples, and small training sets generally reduce performance. Most classical Quantitative Structure-Activity Relationship (QSAR) studies for a specific enzyme system have been performed on small data sets. We focus on the neuro-fuzzy prediction of the biological activities of HIV-1 protease inhibitory compounds when inferring from small training sets. We propose two computational intelligence prediction techniques that are suitable for small training sets, at the expense of some computational overhead. Both techniques are based on the FAMR model, a Fuzzy ARTMAP (FAM) incremental learning system used for classification and probability estimation. During the learning phase, each sample pair is assigned a relevance factor proportional to the importance of that pair. The two algorithms proposed in this paper are: 1) The GA-FAMR algorithm, which is new and consists of two stages: a) in the first stage, a genetic algorithm (GA) optimizes the relevances assigned to the training data, which improves the generalization capability of the FAMR; b) in the second stage, the optimized relevances are used to train the FAMR. 2) The Ordered FAMR, which is derived from a known algorithm: instead of optimizing relevances, it optimizes the order of data presentation using the algorithm of Dagher et al. In our experiments, we compare these two algorithms with an algorithm not based on the FAM, the previously introduced FS-GA-FNN. We conclude that, when inferring from small training sets, both techniques are efficient in terms of generalization capability and execution time, and that the computational overhead introduced is compensated by better accuracy. Finally, the proposed techniques are used to predict the biological activities of newly designed potential HIV-1 protease inhibitors.
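The two-stage idea behind GA-FAMR (first optimize per-sample relevances with a genetic algorithm, then train with the optimized relevances) can be illustrated with a minimal sketch. This is not the authors' implementation: the FAMR itself is replaced here by a simple relevance-weighted kernel regressor as a stand-in learner, and the toy data, the function names (`predict`, `validation_error`, `optimize_relevances`), and all GA settings (population size, one-point crossover, Gaussian mutation, elitism) are assumptions made purely for illustration.

```python
import math
import random

def predict(x, train, relevances, bandwidth=0.5):
    """Relevance-weighted kernel regression: a stand-in learner, NOT the FAMR."""
    num = den = 0.0
    for (xi, yi), q in zip(train, relevances):
        # Each training pair contributes in proportion to its relevance q.
        w = q * math.exp(-((x - xi) ** 2) / (2 * bandwidth ** 2))
        num += w * yi
        den += w
    return num / den if den > 0 else 0.0

def validation_error(relevances, train, valid):
    """Mean squared error on held-out pairs; lower is better."""
    return sum((predict(x, train, relevances) - y) ** 2 for x, y in valid) / len(valid)

def optimize_relevances(train, valid, pop_size=20, generations=30, seed=0):
    """Stage 1 (assumed design): an elitist GA over relevance vectors in [0, 1]^n."""
    rng = random.Random(seed)
    n = len(train)
    # Include the uniform-relevance baseline so that elitism guarantees the
    # result is never worse than unweighted training on the validation pairs.
    population = [[1.0] * n] + [
        [rng.random() for _ in range(n)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=lambda q: validation_error(q, train, valid))
        elite = population[: pop_size // 2]          # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)                # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)                     # mutate a single gene
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        population = elite + children
    return min(population, key=lambda q: validation_error(q, train, valid))

# Hypothetical toy data: y is roughly x, with one corrupted pair at x = 1.5.
train = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (1.5, 9.0), (2.0, 2.0), (2.5, 2.5)]
valid = [(0.25, 0.25), (1.25, 1.25), (1.75, 1.75), (2.25, 2.25)]

# Stage 1: search for relevances that minimize held-out error.
best = optimize_relevances(train, valid)

# Stage 2: "train" the stand-in learner using the optimized relevances.
def model(x):
    return predict(x, train, best)

print("baseline MSE: ", validation_error([1.0] * len(train), train, valid))
print("optimized MSE:", validation_error(best, train, valid))
```

In this sketch the fitness function is the error on a held-out set, which is one plausible way to reward generalization; the abstract does not state which fitness measure the GA-FAMR actually uses. The cost of the search (many evaluations of the learner per generation) mirrors the computational overhead the abstract mentions.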
