Journal: Current Bioinformatics

Comparison of Kernel and Decision Tree-Based Algorithms for Prediction of MicroRNAs Associated with Cancer

Abstract

The discovery of microRNAs (miRNAs or miRs) in the 1990s opened a line of research that has shed light on the involvement of these small non-coding RNAs in several developmental pathways and diseases, including cancer. While algorithms that predict the binding of miRNAs to their targets are abundant, the same is not true for algorithms that predict the association of miRNAs with targets implicated in cancer. Machine learning approaches that have been used for target prediction need to be extended, with proper feature selection, to reach an acceptable level of accuracy in predicting associations of miRNAs with cancer. In this study we compare three learning algorithms for predicting miRNAs associated with cancer: the kernel-based Support Vector Machine (SVM) and the decision tree-based Random Forest (RF) and C4.5. Sixty informative features were extracted from a dataset of experimentally validated miRNAs, based on sequence, the thermodynamics of miRNA-mRNA binding, and their hybridization. Features were first ranked by F-score, and a two-stage Recursive Feature Elimination (RFE) process was then used to select the optimal feature subset for each classifier. Class imbalance in the training set was addressed with a cost-sensitive approach. The performance of each learning algorithm was evaluated in terms of precision, recall, F-measure and AUC. The best-performing algorithm is then to be used to construct a two-step binary classifier, consisting of miRSEQ and miRINT, that identifies whether a miRNA is associated with the cancer pathway. Our comparative analysis showed that the decision tree-based RF model performed well in terms of precision and AUC for miRSEQ, but only moderately for miRINT.
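
The F-score ranking step mentioned in the abstract could look roughly like the sketch below. The abstract does not define the F-score, so this assumes the Fisher-style discriminative score often paired with SVM feature selection: between-class separation of the feature means divided by the sum of the within-class sample variances. The function names `f_score` and `rank_features`, and the 0/1 label encoding, are illustrative and not taken from the paper.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Fisher-style F-score of one feature (assumed definition, see lead-in).

    pos, neg: lists of the feature's values in the positive and negative
    class. Each list needs at least two values for the sample variance.
    """
    overall = mean(pos + neg)
    # Between-class term: squared distance of each class mean to the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class term: sum of the two sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return (feature indices by decreasing F-score, list of scores).

    X: list of samples, each a list of feature values.
    y: list of labels, 1 for the positive class and 0 for the negative class.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores
```

A ranking of this kind would only give the starting order. The subsequent two-stage RFE and the cost-sensitive training described in the abstract are separate steps and are not shown here.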
