Noise and redundant information in near-infrared spectra lower the recognition rate of classification models. To address this problem, this paper proposes a feature selection algorithm that combines random forest with game theory. The algorithm first measures feature importance with a random forest and preselects the features that are relevant to classification. It then computes the weights of the preselected features using an improved Shapley value combined with mutual information, and removes redundant features from the weighted feature set to obtain the optimal feature subset. To validate its effectiveness, the algorithm was applied to a tobacco leaf production area identification model. The experimental results show that the proposed feature selection algorithm identifies tobacco leaf production areas well, with a classification recognition rate of up to 95.88%.
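The weighting and redundancy-removal stage can be illustrated with a minimal sketch. The abstract does not specify the "improved" Shapley value, so the code below uses the classic Shapley value instead. It also assumes the features are already discretized and have already passed the random forest prescreening. The characteristic function (mutual information between a feature coalition and the class labels), the `redundancy_threshold` parameter, and all function names are hypothetical choices made for illustration, not the paper's definitions.

```python
from collections import Counter
from itertools import combinations
from math import factorial, log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def coalition_value(columns, subset, labels):
    """Assumed characteristic function: MI between the joint coalition and the labels."""
    if not subset:
        return 0.0
    joint = list(zip(*(columns[f] for f in subset)))
    return mutual_info(joint, labels)


def shapley_weights(columns, labels):
    """Classic (not 'improved') Shapley value per feature; exponential cost, small sets only."""
    feats = list(columns)
    n = len(feats)
    weights = {}
    for f in feats:
        others = [g for g in feats if g != f]
        total = 0.0
        for k in range(n):
            coef = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                gain = (coalition_value(columns, s + (f,), labels)
                        - coalition_value(columns, s, labels))
                total += coef * gain
        weights[f] = total
    return weights


def select_features(columns, labels, redundancy_threshold=0.9):
    """Visit features by descending weight; skip any that a kept feature already explains."""
    weights = shapley_weights(columns, labels)
    kept = []
    for f in sorted(weights, key=weights.get, reverse=True):
        h = entropy(columns[f])
        redundant = any(
            h > 0 and mutual_info(columns[f], columns[g]) / h >= redundancy_threshold
            for g in kept
        )
        if not redundant:
            kept.append(f)
    return kept, weights


# Toy data (made up): "a_copy" duplicates "a"; the label is determined by (a, b).
columns = {
    "a": [0, 0, 1, 1],
    "a_copy": [0, 0, 1, 1],
    "b": [0, 1, 0, 1],
}
labels = [0, 1, 2, 3]
kept, weights = select_features(columns, labels)
print(kept, weights)
```

In this toy case the two duplicated features share the credit that a single copy would receive, and only one of them survives the redundancy filter. Exact Shapley computation enumerates every coalition, so a real spectral feature set would need a sampling approximation or a much smaller prescreened set.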