Journal: Health and Technology

Breast cancer classification along with feature prioritization using machine learning algorithms


Abstract

Purpose: Breast cancer (BC) is one of the most lethal diseases and causes a large number of deaths among women around the world. Prevention and diagnosis are the best options for reducing cancer deaths, and both can be supported by regular examination of a few health-related measurements such as the levels of Glucose, Insulin, HOMA, and Leptin, among others. Based on a few such measurements, this work classifies BC and non-BC patients using state-of-the-art machine learning (ML) techniques and analyzes the features that influence the model to predict a given class.

Methods: We used several ML models, namely Gradient Boosting (GB), XGBoost (XGB), CatBoost (CB), and Light Gradient Boosting Machine (LGBM), to classify BC and estimate feature importance. To interpret the models and quantify each feature's contribution to the BC prediction, we used SHapley Additive exPlanations (SHAP). In addition, a few filter- and wrapper-based feature selection and prioritization algorithms were used to rank the features. To reach a conclusion democratically across these rankings, we applied the traditional Borda method.

Results: The GB classifier gave the best performance among the selected gradient boosting algorithms, with 82.85% accuracy, 80.00% precision, 88.89% recall, and an F1-score of 84.21%. Different algorithms produced different feature orderings. The Borda method concluded that Glucose is the most influential parameter for distinguishing BC from non-BC patients.

Conclusion: In this study, we classified BC and found that the GB classifier achieved the highest accuracy among the GB, CB, XGB, and LGBM classifiers. Using feature selection techniques, SHAP, and the Borda method, we found that Glucose is the most influential parameter for the detection of BC. We also present and analyze the samples that were misclassified by the GB classifier.
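
As a rough illustration of how a Borda count can merge several feature rankings into one consensus ordering, the sketch below may help. It is not the authors' implementation: the function name `borda_rank` and the three input rankings are hypothetical and chosen only for illustration (they are not results from the paper), and the scoring convention assumed here gives a feature in position p of an n-item ranking n - 1 - p points.

```python
from collections import defaultdict


def borda_rank(rankings):
    """Aggregate several best-first feature rankings with a Borda count.

    Each ranking is a list of feature names ordered from most to least
    important. A feature at position p in a ranking of n items earns
    n - 1 - p points; points are summed over all rankings.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] += n - 1 - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings from three different prioritization methods
# (illustrative only, not taken from the study).
rankings = [
    ["Glucose", "Insulin", "HOMA", "Leptin"],
    ["Glucose", "HOMA", "Leptin", "Insulin"],
    ["Insulin", "Glucose", "Leptin", "HOMA"],
]

consensus = borda_rank(rankings)
print(consensus)
```

With these made-up inputs, Glucose collects the most points because it is ranked first or second by every method, which mirrors the kind of agreement the Borda method is meant to surface when individual algorithms disagree.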
