Purpose: This study aims to develop and validate a risk score for predicting the occurrence of diabetic kidney disease (DKD) in individuals with type 2 diabetes using a machine learning (ML) approach. Methods: Recursive feature elimination with cross-validation (RFECV) and recursive feature elimination (RFE) were applied to the Diabetes Clinic of Imam Khomeini Hospital Complex (IKHC) dataset to identify the most important features. These features were entered into a multivariate logistic regression (LR) analysis, and the discrimination and calibration of the resulting model were evaluated. Finally, the model was externally validated. Results: The development dataset included 1907 patients with type 2 diabetes, 763 of whom developed DKD over 5 years. The predictive model performed well in the development dataset when RFECV was implemented with the random forest (RF) algorithm and six features were retained (AUC: 79%). Using these features, the LR-based risk score showed appropriate discrimination (AUC: 75.5%, 95% CI 73-78) and acceptable calibration (χ² = 7.44; p = 0.49). The risk score was then applied to 1543 diabetic patients in the validation dataset, 633 of whom developed DKD over 5 years. The risk score showed sufficient discrimination in the validation dataset (AUC: 75.8%, 95% CI 73-78). Conclusions: We developed and validated a new risk score for predicting DKD via an ML approach, using readily available features that are routinely collected during the periodic screening of patients with type 2 diabetes. In addition, a publicly available web-based tool was developed to calculate the DKD risk score.
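To make the two quantities reported above concrete, the following is a minimal sketch of how a logistic-regression risk score turns a patient's features into a probability, and how discrimination (AUC) can be computed as the chance that a randomly chosen patient with DKD receives a higher score than a randomly chosen patient without it. The coefficients, intercept, feature values, and outcomes are hypothetical toy numbers chosen for illustration; they are not the study's six features or its fitted model.

```python
import math
from typing import Sequence


def risk_score(features: Sequence[float], coefs: Sequence[float], intercept: float) -> float:
    """Logistic-regression risk: sigmoid of the linear predictor."""
    z = intercept + sum(c * x for c, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model parameters (assumptions, not taken from the study).
HYPOTHETICAL_COEFS = [0.8, 0.5]
HYPOTHETICAL_INTERCEPT = -1.0

# Hypothetical toy cohort: two made-up features per patient; 1 = developed DKD.
patients = [[0.2, 0.1], [1.5, 0.9], [0.4, 1.2], [2.0, 1.8], [0.1, 0.3], [1.0, 0.2]]
outcomes = [0, 1, 0, 1, 0, 1]

scores = [risk_score(p, HYPOTHETICAL_COEFS, HYPOTHETICAL_INTERCEPT) for p in patients]
print(f"AUC on toy data: {auc(outcomes, scores):.3f}")
```

The pairwise definition is used here because it needs only the standard library and shows directly what an AUC measures; for cohorts of the size described above, a rank-based implementation from a statistics library would be the more practical choice.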