Recent studies have confirmed that targeting cyclin-dependent kinase 4 (CDK4), and re-establishing cell cycle control with CDK4 inhibitors, has become an attractive direction in the development of targeted cancer therapy. In this work, three machine learning (ML) methods, k-nearest neighbor, C4.5 decision tree, and random forest (RF), were tested for predicting CDK4 inhibitors. All of the resulting models achieved satisfactory prediction performance. Among them, the RF model gave the highest total prediction accuracy, 96.65%, with the parameters Mtry = 13 and jbt = 255. In addition, the 25 molecular descriptors most relevant to CDK4 inhibitors were selected by the best RF model. Our results suggest that ML methods, particularly RF, are highly effective for discovering potential CDK4 inhibitors.