International Conference on Advanced Computer Science and Information Systems

The impact of feature selection methods on machine learning-based docking prediction of Indonesian medicinal plant compounds and HIV-1 protease

Abstract

This work evaluates the use of feature selection methods to reduce the number of features required to predict docking results between Indonesian medicinal plant compounds and HIV-1 protease. Two feature selection methods, Recursive Feature Elimination (RFE) and the Wrapper Method (WM), are trained on a dataset of 7,330 samples and 667 features drawn from PubChem BioAssay and DUD-E decoys. To evaluate the selected features, a dataset of 368 Indonesian herbal chemical compounds, labeled by manually docking them to the PDB HIV-1 protease structure, is used to benchmark the performance of a linear SVM classifier on the different feature sets. Our experiments show that the set of 471 features selected by RFE and the set of 249 features selected by WM reduce classification time by 4.0 and 8.2 seconds, respectively. Although accuracy and sensitivity also increase by 8% and 16%, no meaningful improvement is observed for precision and specificity.
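The abstract reports four evaluation measures without defining them. As a minimal sketch, assuming the standard binary-classification definitions apply (the abstract does not state which formulas the authors used), they can be computed from confusion-matrix counts as follows. The function names and the example labels are hypothetical and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels.

    `positive` marks the class treated as a positive docking prediction;
    this labeling convention is an assumption for illustration.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Divide safely; an undefined ratio is reported as NaN."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, precision, and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate (recall)
        "precision": _ratio(tp, tp + fp),    # positive predictive value
        "specificity": _ratio(tn, tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical labels, not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    for name, value in classification_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, sensitivity rises when fewer true positives are missed, while precision and specificity both depend on the false-positive count. A feature set that recovers more positives without reducing false positives would therefore show the pattern the abstract describes.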
