Journal: International journal of simulation: systems, science and technology

COMPARISON OF MACHINE LEARNING ALGORITHMS IN BREAST CANCER PREDICTION USING THE COIMBRA DATASET

Abstract

In the medical field, machine learning (ML) techniques are playing a significant and growing role because of their high potential to help health practitioners make decisions and diagnoses. This study aims to review ML models that may predict breast cancer in women and to compare their performance. Several clinical features were measured among the 116 participants in the dataset of this study, including insulin, glucose, resistin, adiponectin, homeostasis model assessment (HOMA), leptin, and monocyte chemoattractant protein-1 (MCP-1), along with age and body mass index (BMI). The researchers implemented 11 classification algorithms and their variations, including Logistic Regression (LR), k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Method (GBM), and Naive Bayes (NB), to detect breast cancer on the publicly available Coimbra Breast Cancer Dataset (CBCD). Each classifier uses its own hyperparameter setting, and the classifiers are compared on how well they identify breast cancer. The study concludes that the Gradient Boosting (GB) algorithm is the best classifier for predicting breast cancer on the CBCD, with an accuracy of 74.14%. The k-Nearest Neighbor (kNN) classifier has the fastest training time at 0.000598 seconds, while the nonlinear Support Vector Machine (SVM) classifier has the fastest testing time, reported as 0 seconds. A further conclusion is that body mass index (BMI) is the top predictor, with 50% of the classifiers ranking it first, and glucose comes second. This suggests that BMI and glucose may be a good pair of variables for predicting breast cancer in women.
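The abstract compares classifiers on three measurements: accuracy, training time, and testing time. The sketch below shows one way such a comparison could be set up; it is not the authors' code. It uses only the Python standard library, and everything in it is an assumption made for illustration: the data are synthetic (116 rows and 9 features, matching only the counts given in the abstract, not the CBCD values), the 70/30 holdout split is arbitrary, and the names `evaluate`, `make_synthetic_data`, `KNearestNeighbors`, and `MajorityBaseline` are hypothetical.

```python
import math
import random
import time
from collections import Counter


class MajorityBaseline:
    """Reference model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class KNearestNeighbors:
    """Plain kNN with Euclidean distance and majority vote."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        # kNN is a lazy learner: "training" only stores the data.
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        predictions = []
        for row in X:
            nearest = sorted(
                (math.dist(row, other), label)
                for other, label in zip(self.X_, self.y_)
            )[: self.k]
            votes = Counter(label for _, label in nearest)
            predictions.append(votes.most_common(1)[0][0])
        return predictions


def make_synthetic_data(n_rows=116, n_features=9, seed=0):
    """Synthetic two-class data (NOT the Coimbra dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        # Class 1 is shifted by 2.0 in every feature so the classes separate.
        X.append([rng.gauss(2.0 * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def evaluate(model, X_train, y_train, X_test, y_test):
    """Time fit and predict separately, then compute holdout accuracy."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predicted = model.predict(X_test)
    test_seconds = time.perf_counter() - start

    correct = sum(p == t for p, t in zip(predicted, y_test))
    return {
        "accuracy": correct / len(y_test),
        "train_seconds": train_seconds,
        "test_seconds": test_seconds,
    }


X, y = make_synthetic_data()
indices = list(range(len(X)))
random.Random(1).shuffle(indices)
n_test = int(0.3 * len(indices))  # assumed 70/30 split, not from the paper
test_idx, train_idx = indices[:n_test], indices[n_test:]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

models = {"baseline": MajorityBaseline(), "kNN": KNearestNeighbors(k=5)}
results = {
    name: evaluate(model, X_train, y_train, X_test, y_test)
    for name, model in models.items()
}
for name, scores in results.items():
    print(name, scores)
```

Because kNN only stores the training data in `fit`, a very short training time for kNN is expected in a harness like this. Timings of this size depend on the machine and on the resolution of the timer used, so they are best read as rough indications rather than precise rankings.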
