Benchmarking binary classification models on data sets with different degrees of imbalance

Ligang ZHOU; Kin Keung LAI

Abstract

In practice, there are many binary classification problems, such as credit risk assessment and medical testing to determine whether a patient has a certain disease. However, different problems have different characteristics, which can make them differ in difficulty. One important characteristic is the degree of imbalance between the two classes in a data set. For data sets with different degrees of imbalance, are the commonly used binary classification methods still feasible? In this study, various binary classification models, including traditional statistical methods and newly emerged methods from artificial intelligence, such as linear regression, discriminant analysis, decision tree, neural network, and support vector machines, are reviewed, and their performance, measured by classification accuracy and area under the Receiver Operating Characteristic (ROC) curve, is tested and compared on fourteen data sets with different degrees of imbalance. The results help to select appropriate methods for problems with different degrees of imbalance.
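To illustrate why the abstract names both classification accuracy and area under the ROC curve as measures, here is a minimal sketch in plain Python. It is not the authors' benchmark: the 95:5 data set, the constant-score "classifier", and the helper names `accuracy` and `roc_auc` are all assumptions made up for this example. It only shows that, on an imbalanced data set, a model that always predicts the majority class can reach high accuracy while its AUC stays at chance level.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data set (not from the paper): 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# A trivial model that gives every case the same score and therefore
# always predicts the majority (negative) class.
constant_scores = [0.0] * 100
constant_pred = [0] * 100

majority_accuracy = accuracy(y_true, constant_pred)
majority_auc = roc_auc(y_true, constant_scores)
```

Under these assumptions the trivial model is right on 95 of 100 cases, yet it cannot rank any positive case above any negative one, so accuracy alone would overstate its usefulness.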

Bibliographic record

Source: 《中国高等学校学术文摘·计算机科学》, 2009, No. 2, pp. 205-216 (12 pages)
Authors: Ligang ZHOU; Kin Keung LAI
Affiliation: Department of Management Sciences, City University of Hong Kong, Hong Kong, China (both authors)
Original format: PDF
Text language: chi
Chinese Library Classification: Computing technology, computer technology

