Many machine learning approaches have been proposed for building diagnostic models of chronic gastritis. To date, however, most machine-learning classifiers give no insight into which features play key roles in the derived classifier as a whole or in each individual class. Recently, the variable importance measure yielded by random forest (RF) has been used in many applications. However, in multi-label classification RF yields a single feature ranking common to all classes, which fails to identify the distinct predictive structure of each individual class. This paper develops an improved random forest variable importance measure that evaluates the importance of features for each individual class in multi-class classification problems, and then applies a wrapper method for feature selection to construct the key feature set for each subtype of chronic gastritis. Experimental results show that, compared with previous studies, the selected features are closer to expert knowledge and contribute to a better understanding of the underlying processes that characterize chronic gastritis.
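
The idea of ranking features separately for each class can be illustrated with a generic permutation-based measure. The sketch below is not the improved measure developed in the paper, whose details the abstract does not give. It assumes that the importance of a feature for a class is the mean drop in that class's recall when the feature's column is shuffled. The names `per_class_recall`, `class_specific_importance`, and `toy_predict` are hypothetical, and a hand-written rule stands in for a trained random forest so that the example stays self-contained.

```python
import random
from typing import Callable, Dict, List

Row = List[float]


def per_class_recall(predict: Callable[[Row], str],
                     X: List[Row], y: List[str]) -> Dict[str, float]:
    """Fraction of the samples of each class that `predict` labels correctly."""
    hits: Dict[str, int] = {}
    totals: Dict[str, int] = {}
    for row, label in zip(X, y):
        totals[label] = totals.get(label, 0) + 1
        if predict(row) == label:
            hits[label] = hits.get(label, 0) + 1
    return {c: hits.get(c, 0) / totals[c] for c in totals}


def class_specific_importance(predict: Callable[[Row], str],
                              X: List[Row], y: List[str],
                              n_repeats: int = 20,
                              seed: int = 0) -> Dict[str, List[float]]:
    """Mean drop in per-class recall after shuffling each feature column.

    Assumption: recall drop is used as the importance score; this is a
    generic illustration, not the measure proposed in the paper.
    """
    rng = random.Random(seed)
    baseline = per_class_recall(predict, X, y)
    n_features = len(X[0])
    importance = {c: [0.0] * n_features for c in baseline}
    for j in range(n_features):
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other feature intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = per_class_recall(predict, X_perm, y)
            for c in baseline:
                importance[c][j] += (baseline[c] - permuted[c]) / n_repeats
    return importance


def toy_predict(row: Row) -> str:
    """Hypothetical stand-in for a trained classifier (feature 2 is ignored)."""
    if row[0] > 0.5:
        return "A"
    return "B" if row[1] > 0.5 else "C"


# Synthetic data that matches the rule above; feature 2 is pure noise.
data_rng = random.Random(1)
X: List[Row] = []
y: List[str] = []
for _ in range(10):
    X.append([1.0, float(data_rng.randint(0, 1)), data_rng.random()])
    y.append("A")
    X.append([0.0, 1.0, data_rng.random()])
    y.append("B")
    X.append([0.0, 0.0, data_rng.random()])
    y.append("C")

importance = class_specific_importance(toy_predict, X, y)
for label in sorted(importance):
    print(label, [round(v, 3) for v in importance[label]])
```

In this toy setting, feature 1 receives zero importance for class A but a positive score for classes B and C, a difference that a single ranking pooled over all classes would hide. A wrapper step could then keep, for each class, only the features whose score exceeds a chosen threshold and re-evaluate the classifier on that subset.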