Journal: Analytical Methods

Classification of multi-family enzymes by multi-label machine learning and sequence-based descriptors


Abstract

Multi-family enzymes are of great importance in life, disease and other domains. In enzyme classification, however, multi-family enzymes are always removed from the dataset because traditional single-label prediction methods cannot handle them. To predict the multiple classes of multi-family enzymes, we adopted two multi-label learning algorithms, RAkEL-RF and MLKNN, and two types of protein descriptors, CTD and PseAAC, to build four predictors: RAkEL-RF-CTD, RAkEL-RF-PseAAC, MLKNN-CTD and MLKNN-PseAAC. When the four predictors were evaluated on the training set with 10-fold cross-validation, the overall success rates reached 97.99%, 96.07%, 96.01% and 95.31%, respectively. On the independent test set, the corresponding rates reached 97.57%, 95.03%, 95.9% and 93.9%, respectively. The extremely small difference between the two sets for each predictor, together with the relatively high accuracy, demonstrates the prediction capability and robustness of our predictors. In addition, three of seven pairs of homologous enzymes with different functions and eighteen of twenty-three distantly related enzymes with a similar family were correctly classified by the RAkEL-RF-CTD predictor. These results indicate the extensive applicability of our predictors.
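To make the idea of a sequence-based descriptor concrete, the following is a minimal sketch of the composition and transition parts of a CTD-style encoding for a single physicochemical property. It is an illustration under stated assumptions, not the authors' implementation: the three-class hydrophobicity grouping, the function names and the omission of the distribution part are all choices made here, since the abstract does not specify which properties or groupings were used.

```python
from collections import Counter

# Assumed three-class hydrophobicity grouping often used with CTD descriptors;
# the abstract does not state which properties or groupings the authors used.
GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
RESIDUE_TO_CLASS = {aa: cls for cls, residues in GROUPS.items() for aa in residues}


def encode(sequence):
    """Map each standard residue to its class label, skipping unknown symbols."""
    return [RESIDUE_TO_CLASS[aa] for aa in sequence.upper() if aa in RESIDUE_TO_CLASS]


def composition(encoded):
    """Fraction of residues that fall in each of the three classes."""
    counts = Counter(encoded)
    n = len(encoded)
    return [counts[c] / n if n else 0.0 for c in "123"]


def transition(encoded):
    """Fraction of adjacent residue pairs that switch between two classes."""
    pairs = Counter(
        frozenset(p) for p in zip(encoded, encoded[1:]) if p[0] != p[1]
    )
    n = len(encoded) - 1
    keys = [frozenset("12"), frozenset("13"), frozenset("23")]
    return [pairs[k] / n if n > 0 else 0.0 for k in keys]


def ct_features(sequence):
    """Hypothetical helper: 3 composition + 3 transition values for one property."""
    encoded = encode(sequence)
    return composition(encoded) + transition(encoded)
```

A fixed-length vector of this kind, computed per protein and concatenated over several properties, is the sort of input a multi-label learner such as RAkEL-RF or MLKNN would consume, with each enzyme allowed to carry more than one family label.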
