Journal: PLoS One

Predicting breast cancer risk using personal health data and machine learning models

Abstract

Among women, breast cancer is a leading cause of death. Breast cancer risk predictions can inform screening and preventative actions. Previous works found that adding inputs to the widely used Gail model improved its ability to predict breast cancer risk. However, these models used simple statistical architectures, and the additional inputs were derived from costly and/or invasive procedures. By contrast, we developed machine learning models that used highly accessible personal health data to predict five-year breast cancer risk. We created machine learning models using only the Gail model inputs and models using both Gail model inputs and additional personal health data relevant to breast cancer risk. For both sets of inputs, six machine learning models were trained and evaluated on the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial data set. The area under the receiver operating characteristic curve (AUC) quantified each model's performance. Since this data set has a small percentage of positive breast cancer cases, we also reported sensitivity, specificity, and precision. We used DeLong tests (p < 0.05) to compare the testing data set performance of each machine learning model to that of the Breast Cancer Risk Prediction Tool (BCRAT), an implementation of the Gail model. None of the machine learning models with only BCRAT inputs were significantly stronger than the BCRAT. However, the logistic regression, linear discriminant analysis, and neural network models with the broader set of inputs were all significantly stronger than the BCRAT. These results suggest that relative to the BCRAT, additional easy-to-obtain personal health inputs can improve five-year breast cancer risk prediction. Our models could be used as non-invasive and cost-effective risk stratification tools to increase early breast cancer detection and prevention, motivating both immediate actions like screening and long-term preventative measures such as hormone replacement therapy and chemoprevention.
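
The abstract evaluates models with the area under the ROC curve plus sensitivity, specificity, and precision. The following is a minimal sketch of how those four metrics can be computed from risk scores. It is not the authors' code: the function names `auc` and `threshold_metrics` and the 0.5 cutoff in the usage note are hypothetical choices for illustration, and the DeLong test itself is not implemented here.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half). Equivalent to the Mann-Whitney U statistic
    divided by the number of positive/negative pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def threshold_metrics(labels, scores, threshold):
    """Sensitivity, specificity, and precision when every case with
    score >= threshold is predicted positive. The threshold is an
    assumption of this sketch, not a value taken from the paper."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),  # share of true cases detected
        "specificity": _ratio(tn, tn + fp),  # share of non-cases cleared
        "precision": _ratio(tp, tp + fp),    # share of flagged cases that are true
    }
```

For example, calling `auc(y, s)` and `threshold_metrics(y, s, 0.5)` on a list of 0/1 outcomes `y` and predicted risks `s` returns the ranking quality and the three threshold-dependent rates. With few positive cases, as in the data set described above, precision can be low even when AUC looks strong, which is one reason to report these metrics together.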
