Conference: IEEE International Conference on Bioinformatics and Biomedicine

Research on early risk predictive model and discriminative feature selection of cancer based on real-world routine physical examination data

Abstract

Most cancers show no obvious symptoms at early stages, and curative treatment is no longer an option by the time the cancer is diagnosed. Accurate prediction of early cancer risk has therefore become an urgent need in medicine. In this paper, we use real-world routine physical examination data to identify the most discriminative features of cancer with the ReliefF algorithm and to build early risk predictive models of cancer with three machine learning (ML) algorithms. The data are physical examination records with a follow-up visit one month later, obtained from the CiMing Health Checkup Center. From a data set of 34 features and 2300 candidates, the ReliefF algorithm selects the top 30 features by weight value, denoted Sub(30). Three models are proposed for predicting cancer risk and are evaluated with 5-fold cross-validation: a 4-layer deep neural network (DNN) with 2 hidden layers, trained with the back-propagation (BP) algorithm; a support vector machine (SVM) with a linear kernel; and a CART decision tree. We use predictive accuracy, AUC-ROC, sensitivity, and specificity to assess the discriminative ability of the three proposed methods. The results show that, compared with the other two methods, SVM obtains the higher AUC (0.926) and specificity (95.27%), while DNN achieves the highest predictive accuracy (86%). Moreover, we propose a fuzzy threshold interval for DNN; with the revised threshold interval, the sensitivity, specificity, and accuracy of DNN are 90.20%, 94.22%, and 93.22%, respectively. The research indicates that applying ML methods together with risk feature selection to real-world routine physical examination data is meaningful and promising for cancer prediction.
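The abstract names ReliefF as the feature-ranking step but gives no implementation details. As a rough illustration of how weight-based ranking of this kind can work, the sketch below implements a simplified two-class ReliefF in plain Python. The function names (`relieff_weights`, `top_k_features`), the neighbor count, and the toy data are assumptions made for this example and do not come from the paper.

```python
import random


def relieff_weights(X, y, n_neighbors=3, n_samples=None, seed=0):
    """Simplified two-class ReliefF (illustrative, not the authors' code).

    A feature gains weight when it differs between a sample and its
    nearest neighbors of the other class (misses), and loses weight when
    it differs between the sample and its same-class neighbors (hits).
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(d))

    # Use every sample by default, or a random subset of size n_samples.
    m = n if n_samples is None else min(n_samples, n)
    picked = rng.sample(range(n), m)

    weights = [0.0] * d
    for i in picked:
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: distance(X[i], X[j]))
        hits = [j for j in others if y[j] == y[i]][:n_neighbors]
        misses = [j for j in others if y[j] != y[i]][:n_neighbors]
        for f in range(d):
            if hits:
                weights[f] -= sum(diff(f, X[i], X[j]) for j in hits) / (m * len(hits))
            if misses:
                weights[f] += sum(diff(f, X[i], X[j]) for j in misses) / (m * len(misses))
    return weights


def top_k_features(weights, k):
    """Indices of the k highest-weighted features, best first."""
    return sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)[:k]


# Toy data (hypothetical): feature 0 tracks the label, feature 1 does not.
y = [i % 2 for i in range(40)]
X = [[y[i] + 0.05 * (i * 3 % 5 - 2), (i * 7 % 40) / 40] for i in range(40)]

weights = relieff_weights(X, y)
print(top_k_features(weights, 1))
```

In the setting described in the abstract, the analogous call would keep the 30 highest-weighted of the 34 features, for example `top_k_features(weights, 30)`, before the three classifiers are trained.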
