A learning method for the class imbalance problem with medical data sets.

Li DC; Liu CW; Hu SC

Abstract

In medical data sets, the data consist predominantly of "normal" samples, with only a small percentage of "abnormal" ones, which leads to the so-called class imbalance problem. In such problems, feeding all the data into a classifier to build the learning model usually biases learning toward the majority class. To deal with this, this paper balances the data sets by over-sampling the minority class and under-sampling the majority class. For the majority class, it builds a Gaussian-type fuzzy membership function and applies an alpha-cut to reduce the data size; for the minority class, it uses the mega-trend diffusion membership function to generate virtual samples. After balancing the class sizes, the paper further extends the data attributes into a higher-dimensional space using classification-related information to enhance classification accuracy. Two medical data sets, Pima Indians diabetes and BUPA liver disorders, are used to illustrate the approach. The results indicate that the proposed method achieves better classification performance than SVM, the C4.5 decision tree, and the methods of two other studies.
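
The balancing strategy described in the abstract can be illustrated with a rough Python sketch. The abstract gives no formulas, so everything here is an assumption: the Gaussian membership is built from each attribute's mean and standard deviation, a majority sample is kept only if its membership reaches `alpha` on every attribute, and the minority class is extended with a simple triangular sampler that stands in for the mega-trend diffusion function rather than reproducing it. The function names, `alpha`, and `spread` are hypothetical.

```python
import numpy as np

def gaussian_alpha_cut(X, alpha=0.5):
    """Under-sample: keep rows whose Gaussian membership is >= alpha
    on every attribute (assumed rule, not taken from the paper)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # guard constant columns
    membership = np.exp(-0.5 * ((X - mu) / sigma) ** 2)
    return X[(membership >= alpha).all(axis=1)]

def triangular_virtual_samples(X, n_new, spread=0.5, rng=None):
    """Over-sample: draw virtual rows from a triangular distribution per
    attribute. A simplified stand-in for mega-trend diffusion; the bounds
    are the observed range widened by the hypothetical `spread` factor."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    center = (lo + hi) / 2
    half = (hi - lo) / 2 * (1 + spread)
    half = np.where(half > 0, half, 1.0)  # triangular needs left < right
    return rng.triangular(center - half, center, center + half,
                          size=(n_new, X.shape[1]))

def rebalance(X_major, X_minor, alpha=0.5, seed=0):
    """Shrink the majority class, then grow the minority class to match."""
    X_major_cut = gaussian_alpha_cut(X_major, alpha)
    n_new = max(len(X_major_cut) - len(X_minor), 0)
    virtual = triangular_virtual_samples(X_minor, n_new, rng=seed)
    return X_major_cut, np.vstack([np.asarray(X_minor, dtype=float), virtual])
```

Calling `rebalance(X_major, X_minor)` on two arrays with the same number of columns returns the reduced majority set and the enlarged minority set. The attribute-extension step mentioned in the abstract is not sketched, since the abstract does not say which classification-related information is used.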

Bibliographic record

Source
Computers in Biology and Medicine | 2010, Issue 5 | 10 pages
Authors
Li DC; Liu CW; Hu SC
Original format: PDF
Language: English
Chinese Library Classification: General medical sciences
