COMPARISON OF MACHINE LEARNING ALGORITHMS IN BREAST CANCER PREDICTION USING THE COIMBRA DATASET

Yolanda D Austria; Marie Luvett Goh; Lorenzo Sta. Maria Jr.; Jay-Ar Lalata; Joselito Eduard Goh; Heintjie Vicente

Abstract

In the medical field, machine learning (ML) techniques play a significant and growing role because of their potential to help health practitioners make decisions and diagnoses. This study reviews ML models that may predict breast cancer in women and compares their performance. The dataset covers 116 participants and nine clinical features: insulin, glucose, resistin, adiponectin, homeostasis model assessment (HOMA), leptin, monocyte chemoattractant protein-1 (MCP-1), age, and body mass index (BMI). The researchers implemented 11 classification algorithms and variations, including Logistic Regression (LR), k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Method (GBM), and Naive Bayes (NB), to detect breast cancer on the publicly available Coimbra Breast Cancer Dataset (CBCD). Each classifier uses its own hyperparameter setting, and the classifiers are compared on how well they identify breast cancer. The study concludes that Gradient Boosting is the best classifier for predicting breast cancer on the CBCD, with an accuracy of 74.14%. The kNN classifier has the fastest training time at 0.000598 seconds, while the nonlinear SVM classifier has the fastest testing time, reported as 0 seconds. The paper also concludes that BMI is the top predictor, with 50% of the classifiers ranking it first, and that glucose comes second. This suggests that BMI and glucose may be a good pair of variables for predicting breast cancer in women.
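The comparison described in the abstract (accuracy, training time, and testing time per classifier) can be sketched as follows. This is a minimal illustration, not the authors' code: it uses only the Python standard library, two hand-written classifiers (kNN and Gaussian Naive Bayes) in place of the paper's 11, and synthetic data in place of the CBCD. The helper names (`make_synthetic`, `evaluate`), the choice of k = 5, the 50/50 split, and the random seeds are all assumptions made for the example.

```python
import math
import random
import time
from collections import Counter


def make_synthetic(n=116, n_features=9, seed=0):
    """Two-class synthetic data; a stand-in, NOT the Coimbra dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes
        # The class means differ by 1.0 in every feature (unit variance).
        X.append([rng.gauss(float(label), 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


class KNN:
    """k-nearest neighbours with Euclidean distance and majority vote."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        # Lazy learner: "training" only stores the data.
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for row in X:
            nearest = sorted(zip(self.X, self.y),
                             key=lambda pair: math.dist(pair[0], row))[:self.k]
            preds.append(Counter(lbl for _, lbl in nearest).most_common(1)[0][0])
        return preds


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and class."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [r for r, lbl in zip(X, y) if lbl == label]
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            # Small constant keeps the variance strictly positive.
            variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                         for c, m in zip(cols, means)]
            self.stats[label] = (math.log(len(rows) / len(y)), means, variances)
        return self

    def _log_posterior(self, row, log_prior, means, variances):
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances))

    def predict(self, X):
        return [max(self.stats,
                    key=lambda lbl: self._log_posterior(row, *self.stats[lbl]))
                for row in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return accuracy plus wall-clock training and testing time in seconds."""
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    t1 = time.perf_counter()
    preds = model.predict(X_test)
    t2 = time.perf_counter()
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    return {"accuracy": accuracy, "train_s": t1 - t0, "test_s": t2 - t1}


X, y = make_synthetic()
indices = list(range(len(X)))
random.Random(1).shuffle(indices)
half = len(indices) // 2  # assumed 50/50 split; the abstract does not state one
train_idx, test_idx = indices[:half], indices[half:]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

models = {"kNN": KNN(k=5), "NB": GaussianNaiveBayes()}
results = {name: evaluate(m, X_train, y_train, X_test, y_test)
           for name, m in models.items()}
for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.2%} "
          f"train={r['train_s']:.6f}s test={r['test_s']:.6f}s")
```

Because kNN only stores the training data, its training time is expected to be very small, which is in line with the abstract reporting kNN as the fastest to train. Timings on a dataset this small sit near the resolution of many timers, so a reported value of 0 seconds may reflect the clock used rather than truly instantaneous prediction; `time.perf_counter` is used here because it is a high-resolution timer.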

Bibliographic details

Source
International journal of simulation: systems, science and technology | 2019, Issue 2 | 8 pages
Authors
Yolanda D Austria; Marie Luvett Goh; Lorenzo Sta. Maria Jr.; Jay-Ar Lalata; Joselito Eduard Goh; Heintjie Vicente
Original format
PDF
Chinese Library Classification
Computing technology, computer technology
Keywords
breast cancer; machine learning algorithm; classifier; Logistic Regression (LR); k-Nearest Neighbor (kNN); Support Vector Machine (SVM); Decision Tree (DT); Random Forest (RF); Gradient Boosting Method (GBM); Naive Bayes (NB)
