Accurate classification of lung nodules can improve the diagnosis and treatment of lung cancer. In this study, we present a method for classifying nodule candidates as nodules or non-nodules based on radiomics and machine learning algorithms. We validate the method on 4999 nodule candidates and use accuracy, recall, precision, F1-score, and the area under the receiver operating characteristic curve (AUC) as evaluation metrics. Experimental results show that, for most classifiers, recursive feature elimination (RFE) yields higher AUC values than the chi-square test and principal component analysis; selecting 15 features gives a higher AUC than selecting 10 or 20 features; and a 9:1 ratio of training data to testing data gives the best predictive performance. With RFE as the feature selection method, 15 features, and a 9:1 ratio, random forest achieves the highest AUC (0.9536), accuracy (0.9580), recall (0.9893), precision (0.9392), and F1-score (0.9636).
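To make the best-performing configuration concrete, the following is a minimal sketch, using scikit-learn, of RFE selecting 15 features, a 9:1 train/test split, and a random forest scored with the five metrics named above. It is not the study's implementation. The synthetic data from `make_classification` stands in for the radiomic features, and the use of a random forest as the RFE base estimator, the hyperparameters, and the function name `evaluate_rfe_random_forest` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split


def evaluate_rfe_random_forest(X, y, n_features=15, test_size=0.1, seed=0):
    """Hypothetical helper: RFE feature selection followed by a random forest."""
    # 9:1 split between training and testing data, stratified by class label.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    # RFE repeatedly drops the least important feature until n_features remain.
    # Assumption: a random forest ranks the features; the source does not say
    # which base estimator was used.
    selector = RFE(
        RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=n_features,
    )
    selector.fit(X_train, y_train)  # fit on training data only
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # Final classifier trained on the selected features.
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(X_train_sel, y_train)

    y_pred = clf.predict(X_test_sel)
    y_score = clf.predict_proba(X_test_sel)[:, 1]  # positive-class probability

    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
    }
    return metrics, selector


# Synthetic placeholder data (assumption): 500 candidates with 30 features each.
X, y = make_classification(
    n_samples=500, n_features=30, n_informative=10, random_state=0
)
metrics, selector = evaluate_rfe_random_forest(X, y)
print(metrics)
```

Fitting the selector on the training portion only keeps the held-out 10% from influencing which features are chosen. The values printed for the synthetic data are not comparable to the results reported above.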