Source: BMC Proceedings

Classification models for clear cell renal carcinoma stage progression, based on tumor RNAseq expression trained supervised machine learning algorithms



Abstract

Background: Clear-cell renal cell carcinoma (ccRCC) is the most prevalent, chemotherapy-resistant, and lethal adult kidney cancer. Because ccRCC has heterogeneous molecular profiles and an asymptomatic early stage, novel diagnostic and prognostic biomarkers are needed. This study aims to develop classification models that distinguish early-stage from late-stage ccRCC based on gene expression profiles. We employed four supervised learning algorithms (J48, Random Forest, SMO, and Naïve Bayes), with model learning enriched by fast correlation-based feature selection, to develop classification models trained on sequencing-based gene expression data from RNA-seq experiments obtained from The Cancer Genome Atlas.

Results: The models developed in the study were evaluated by 10-fold cross-validation and by testing on an independent dataset. The Random Forest based prediction model performed best among them, with a sensitivity of 89%, an accuracy of 77%, and an area under the receiver operating characteristic curve of 0.8.

Conclusions: We anticipate that the prioritized subset of 62 genes and the prediction models developed in this study will help experimental oncologists understand the molecular mechanisms of stage progression more quickly and discover prognostic factors for ccRCC tumors.
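The abstract names two quantitative ingredients without defining them: the relevance score behind fast correlation-based feature selection (symmetrical uncertainty) and the reported evaluation metrics (sensitivity, accuracy, area under the ROC curve). The Python sketch below shows how these quantities are commonly computed. It is not the authors' code. All function names and the toy arrays are hypothetical, the features are assumed to be already discretized, and the numbers are illustrative only, not data from the study.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    # Mutual information from marginal and joint entropies.
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def sensitivity_and_accuracy(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); accuracy = correct / total."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return tp / (tp + fn), correct / len(pairs)


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = late stage, 0 = early stage.
stage = [0, 0, 0, 0, 1, 1, 1, 1]
gene_a = [0, 0, 0, 0, 1, 1, 1, 1]  # discretized expression, tracks stage
gene_b = [0, 1, 0, 1, 0, 1, 0, 1]  # discretized expression, unrelated
print(symmetrical_uncertainty(gene_a, stage))
print(symmetrical_uncertainty(gene_b, stage))

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]  # classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(sensitivity_and_accuracy(y_true, y_pred))
print(roc_auc(y_true, scores))
```

In a feature selection step of this kind, genes with high symmetrical uncertainty against the stage label would be retained, and genes that are highly correlated with an already selected gene would be treated as redundant. The pairwise AUC shown here is quadratic in the number of samples, which is acceptable for a sketch but not for large cohorts.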
