Journal: International Journal of Biological Sciences

GBDTCDA: Predicting circRNA-disease Associations Based on Gradient Boosting Decision Tree with Multiple Biological Data Fusion

Abstract

Circular RNA (circRNA) is a non-coding RNA molecule with a closed-loop structure that plays a significant role in gene regulation. Many previous studies have shown that circRNAs can act as miRNA sponges. CircRNA is therefore also important for diagnosing, treating and inferring disease. However, traditional experimental approaches to verifying associations between circRNAs and diseases are time-consuming and costly. Few computational models exist for predicting potential circRNA-disease associations, which motivated us to propose a new one. In this study, we propose a machine learning based computational model that uses a Gradient Boosting Decision Tree with multiple biological data to predict circRNA-disease associations (GBDTCDA). The known circRNA-disease association data are downloaded from the CircR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). The feature vector of each circRNA-disease association pair is composed of four parts: the statistics information of different biological networks, the graph theory information of different biological networks, the network information of circRNA-disease associations, and circRNA nucleotide sequence information. We use these feature vectors to train a gradient boosting decision tree regression model. Leave-one-out cross validation (LOOCV) is then adopted to evaluate the performance of the model. GBDTCDA also obtains good results when predicting circRNAs related to some common diseases: the Area Under the ROC Curve (AUC) values for basal cell carcinoma, non-small cell lung cancer and cervical cancer are 95.8%, 88.3% and 93.5%, respectively. To further illustrate the performance of GBDTCDA, a case study of breast cancer is also included. Based on these experimental results and analyses, our proposed method GBDTCDA is a powerful tool for predicting potential circRNA-disease associations. © The author(s).
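The pipeline the abstract describes (feature vectors, a gradient boosting regression model, LOOCV, AUC) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, the four columns are placeholder features rather than the four feature groups of the paper, the trees are depth-1 stumps fitted to squared-error residuals, and all function names are hypothetical.

```python
import random

def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no usable split: fall back to the mean residual
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]

def stump_predict(stump, row):
    j, t, left_mean, right_mean = stump
    return left_mean if row[j] <= t else right_mean

def fit_gbdt(X, y, n_rounds=10, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + lr * stump_predict(stump, row) for p, row in zip(preds, X)]
    return base, lr, stumps

def gbdt_predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

def loocv_scores(X, y, **kwargs):
    """Leave-one-out: hold out each sample once, train on the rest, score it."""
    scores = []
    for i in range(len(X)):
        model = fit_gbdt(X[:i] + X[i + 1:], y[:i] + y[i + 1:], **kwargs)
        scores.append(gbdt_predict(model, X[i]))
    return scores

def auc(y, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic toy data (assumption): column 0 is informative, columns 1-3 are noise.
rng = random.Random(0)
y = [1] * 10 + [0] * 10
X = [[rng.uniform(0.8, 1.0) if label else rng.uniform(0.0, 0.2)]
     + [rng.random() for _ in range(3)] for label in y]

scores = loocv_scores(X, y, n_rounds=10, lr=0.3)
auc_value = auc(y, scores)
print(f"LOOCV AUC on toy data: {auc_value:.3f}")
```

In practice one would more likely use an established library; scikit-learn, for example, provides `GradientBoostingRegressor`, `LeaveOneOut` and `roc_auc_score`. The hand-rolled version is shown only to make each step explicit.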
