Conference: International conference on advanced data mining and applications

Comparison of Cutoff Strategies for Geometrical Features in Machine Learning-Based Scoring Functions

Abstract

Counts of protein-ligand contacts are popular geometrical features in scoring functions for structure-based drug design. When extracting features, cutoff values define the range of distances within which a protein-ligand atom pair is considered to be in contact. However, the effects of the number of ranges and the choice of cutoff values on the predictive ability of scoring functions are unclear. Here, we compare five cutoff strategies (one-, two-, three-, and six-range, and soft boundary) with four machine learning methods. Prediction models are constructed using the latest PDBbind v2012 data sets and assessed by correlation coefficients. Our results show that the optimal one-range cutoff value lies between 6 and 8 Å instead of the customary choice of 12 Å. In general, two-range models improve predictive performance, as measured by correlation coefficients, by 3-5%, but introducing more cutoff ranges does not always help improve prediction accuracy.
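
To make the idea of range-based contact counting concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function name `contact_counts`, the toy coordinates, and the cutoff values of 6 and 12 Å are assumptions chosen for illustration. It also ignores atom types, whereas a practical scoring function would typically count contacts separately per protein-ligand atom-type pair, and it does not cover the soft-boundary strategy.

```python
import math


def contact_counts(protein_xyz, ligand_xyz, cutoffs):
    """Count protein-ligand atom pairs falling in each distance range.

    `cutoffs` is an increasing list of upper bounds in angstroms.
    For example, [6.0, 12.0] defines the half-open ranges
    [0, 6) and [6, 12). Pairs at or beyond the last cutoff are ignored.
    (Hypothetical helper, written only for this illustration.)
    """
    edges = [0.0] + list(cutoffs)
    counts = [0] * len(cutoffs)
    for p in protein_xyz:
        for l in ligand_xyz:
            d = math.dist(p, l)  # Euclidean distance (Python 3.8+)
            for i in range(len(cutoffs)):
                if edges[i] <= d < edges[i + 1]:
                    counts[i] += 1
                    break
    return counts


# Toy coordinates in angstroms (made up for the example).
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(3.0, 0.0, 0.0), (0.0, 7.0, 0.0)]

one_range = contact_counts(protein, ligand, [12.0])       # single cutoff
two_range = contact_counts(protein, ligand, [6.0, 12.0])  # two ranges
print(one_range)  # [3]
print(two_range)  # [1, 2]
```

The one-range model yields a single feature per complex, while the two-range model splits the same contacts into a short-range and a longer-range count. Adding cutoffs lengthens the feature vector without changing the underlying distances.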
