Conference: International Conference on Science and Applied Science

Classification analysis using support vector machine, decision tree, and neural network with principal component analysis to determine molecular structure relationship from its biological activity on dipeptidyl peptidase IV inhibitors

Abstract

Type 2 diabetes mellitus (T2DM) is a chronic metabolic disease that often affects adults. Dipeptidyl peptidase-IV (DPP-IV) inhibitors are drugs for T2DM that block the enzyme DPP-IV. Currently available inhibitors have adverse effects, so novel DPP-IV inhibitors with minimal adverse effects are still needed. In this paper, a machine learning approach is used to relate the molecular structure of DPP-IV inhibitors to their biological activity. The data set contains 3363 inhibitors, 1849 labeled active and 1514 labeled inactive, described by topological fingerprints. Because topological fingerprints produce high-dimensional data, principal component analysis is used to reduce the dimension of the data set. Support vector machine, decision tree, and neural network classifiers are then used to classify the DPP-IV inhibitors. The overall classification using the support vector machine yields a specificity, sensitivity, accuracy, and Matthews correlation coefficient of 0.774, 0.826, 0.803, and 0.604, respectively. These results indicate that the support vector machine can classify active and inactive DPP-IV inhibitors well when topological fingerprints are used as descriptors.
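The four reported measures all follow from the counts of a binary confusion matrix. The sketch below shows one common way to compute them, with "active" as the positive class. The function name `binary_metrics` and the example counts are hypothetical illustrations and are not taken from the paper, which does not report its confusion matrix in the abstract.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute specificity, sensitivity, accuracy, and MCC from confusion-matrix counts.

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)

    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "specificity": specificity,
        "sensitivity": sensitivity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    for name, value in binary_metrics(tp=80, tn=70, fp=30, fn=20).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which makes it a useful complement when the active and inactive classes differ in size, as they do here (1849 versus 1514).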
