Conference: International Conference on Advanced Computer Science and Information Systems

The impact of feature selection methods on machine learning-based docking prediction of Indonesian medicinal plant compounds and HIV-1 protease

Abstract

This work evaluates the use of feature selection methods to reduce the number of features required to predict docking results between Indonesian medicinal plant compounds and HIV-1 protease. Two feature selection methods, Recursive Feature Elimination (RFE) and the Wrapper Method (WM), are trained on a dataset of 7,330 samples and 667 features drawn from PubChem BioAssay and DUD-E decoys. To evaluate the selected features, a dataset of 368 Indonesian herbal chemical compounds, labeled by manually docking each compound to HIV-1 protease from the PDB, is used to benchmark a linear SVM classifier on the different feature sets. Our experiments show that the 471 features selected by RFE and the 249 features selected by WM reduce classification time by 4.0 and 8.2 seconds, respectively. Although accuracy and sensitivity also increase by 8% and 16%, no meaningful improvement is observed for precision or specificity.
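The four metrics named in the abstract can be related to one another through the binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function name `classification_metrics` and the toy labels are hypothetical, and label 1 is assumed to mean a compound predicted or observed to dock.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, precision, and specificity
    for binary labels, where 1 is the positive class (assumed: docks)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall on the positive class
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),   # recall on the negative class
    }


# Toy labels for illustration only; these are not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because sensitivity depends only on the positive class while specificity depends only on the negative class, a feature set can raise sensitivity without changing specificity, which is consistent with the pattern the abstract reports.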