Using concrete application examples, this paper shows that accuracy and error rate, the most widely used measures of classifier performance, are inadequate for classification problems that involve imbalanced data sets, semantically related multi-class labels, or unequal misclassification costs. To address this, it proposes new evaluation measures, chosen and combined according to the problem at hand: precision, recall, miss rate, false-detection rate, F-measure, the classification cost matrix, and the loss function. Experiments show that these measures are well suited to evaluating classifier performance on imbalanced data sets, semantically related multi-class problems, and problems with unequal misclassification costs.
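The weakness of accuracy on imbalanced data can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's method or data: the function names, the 10-positive/990-negative toy set, the 50:1 cost ratio, and the definitions of miss rate as FN/(TP+FN) and false-detection rate as FP/(FP+TN). The abstract names these measures but does not give their formulas.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive and p != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred, beta=1.0):
    """Compute accuracy alongside the measures named in the abstract."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    b2 = beta * beta
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        # Assumed definitions; the abstract does not spell these out.
        "miss_rate": safe_div(fn, tp + fn),
        "false_detection_rate": safe_div(fp, fp + tn),
        "f_measure": safe_div((1 + b2) * precision * recall,
                              b2 * precision + recall),
    }


def mean_cost(y_true, y_pred, cost):
    """Average misclassification cost, where cost[true][pred] is the penalty."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced set: 10 positives, 990 negatives.
y_true = [1] * 10 + [0] * 990
# A trivial classifier that always predicts the majority (negative) class.
y_pred = [0] * 1000

metrics = binary_metrics(y_true, y_pred)
# Hypothetical cost matrix: missing a positive costs 50, a false alarm costs 1.
cost = [[0, 1], [50, 0]]
average_cost = mean_cost(y_true, y_pred, cost)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"mean_cost: {average_cost:.3f}")
```

In this toy setting the always-negative classifier scores high on accuracy while its recall and F-measure are zero and its miss rate is one, and the cost matrix turns every missed positive into a large penalty. This is the kind of gap between accuracy and the other measures that the abstract describes.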