To address the multi-class imbalance problem in Internet traffic classification, this paper proposes a hybrid feature selection approach based on relative uncertainty and symmetric uncertainty. First, the relative uncertainty value is used to select a candidate feature subset for each class. Then, within each candidate feature subset, features with high symmetric uncertainty are retained and the others are discarded. Finally, the optimal feature subset is selected through a wrapper approach based on the C4.5 decision tree. Experimental results on real-world Internet traffic data sets show that, compared with traditional feature selection approaches, the proposed approach achieves higher overall accuracy, higher recall on minority classes, and a higher g-mean value, thereby reducing the adverse effect of multi-class imbalance.
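As an illustration of two quantities named in the abstract, the following minimal sketch computes symmetric uncertainty between a discrete feature and the class label, and the g-mean of a set of predictions. It assumes the standard definitions, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) and g-mean as the geometric mean of per-class recall; the abstract does not state the exact formulas used in the paper, and the function names here are hypothetical. Relative uncertainty and the C4.5 wrapper step are not shown.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(feature, labels):
    """Standard SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1].

    Assumption: the feature is already discretized.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy              # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    product = 1.0
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        product *= hits / total
    return product ** (1.0 / len(classes))
```

Under these definitions, a feature that determines the class exactly has a symmetric uncertainty of 1 and an independent feature has 0. The g-mean falls to 0 as soon as any single class has zero recall, which is why it is sensitive to poor performance on minority classes.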