An Empirical Study of the Classification Performance of Learners on Imbalanced and Noisy Software Quality Data

Abstract

In the domain of software quality classification, data mining techniques are used to construct models (learners) for identifying software modules that are most likely to be fault-prone. The performance of these models, however, can be negatively affected by class imbalance and noise. Data sampling techniques have been proposed to alleviate the problem of class imbalance, but the impact of data quality on these techniques has not been adequately addressed. We examine the combined effects of noise and imbalance on classification performance when seven commonly-used sampling techniques are applied to software quality measurement data. Our results show that some sampling techniques are more robust in the presence of noise than others. Further, sampling techniques are affected by noise differently given different levels of imbalance.
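The experimental setup described above (class noise combined with class imbalance, followed by data sampling) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not name the seven sampling techniques, so random oversampling and random undersampling are used here only as assumed examples. The function names, the binary 0/1 labels (1 standing for fault-prone), the 90/10 class split, and the 10% noise rate are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def inject_class_noise(labels, rate, rng):
    """Flip each binary label (0 <-> 1) independently with probability `rate`."""
    return [1 - y if rng.random() < rate else y for y in labels]


def random_oversample(rows, labels, rng):
    """Duplicate random minority-class examples until both classes are equal in size."""
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [i for i, y in enumerate(labels) if y == minority]
    idx = list(range(len(labels))) + [rng.choice(pool) for _ in range(deficit)]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def random_undersample(rows, labels, rng):
    """Discard random majority-class examples until both classes are equal in size."""
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    keep = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    keep += rng.sample(majority_idx, min(len(keep), len(majority_idx)))
    keep.sort()
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical data: 100 modules, 10 of them fault-prone (label 1).
rng = random.Random(0)
rows = list(range(100))            # stand-ins for software metric vectors
clean = [0] * 90 + [1] * 10
noisy = inject_class_noise(clean, rate=0.10, rng=rng)

_, y_over = random_oversample(rows, noisy, rng)
_, y_under = random_undersample(rows, noisy, rng)
print("noisy:", Counter(noisy))
print("oversampled:", Counter(y_over))
print("undersampled:", Counter(y_under))
```

Note that oversampling duplicates mislabeled minority examples along with correct ones, while undersampling leaves every minority example (mislabeled or not) in a much smaller training set. This is one intuition for why noise could affect sampling techniques differently, though the sketch itself measures nothing about classifier performance.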

Bibliographic details

Source: Information Reuse and Integration, 2007 IEEE International Conference on | 2007 | pp. 651-658 | 8 pages
Conference location: Kent (GB)
Authors: Seiffert Chris; Khoshgoftaar Taghi M.; Van Hulse Jason; Folleco Andres
Original format: PDF
Text language: English
Chinese Library Classification: Industrial technology
Date indexed: 2022-08-26 14:03:20

Similar literature

1. An empirical study of the classification performance of learners on imbalanced and noisy software quality data [J]. Chris Seiffert, Taghi M. Khoshgoftaar, Jason Van Hulse. Information Sciences: An International Journal. 2014.
2. An empirical study of software change classification with imbalance data-handling methods [J]. Zhu Xiaoyan, Niu Binbin, Whitehead E. James Jr. Software. 2018, No. 11.
3. An empirical study on predictability of software maintainability using imbalanced data [J]. Malhotra Ruchika, Lata Kusum. Software Quality Journal. 2020, No. 4.
4. An Empirical Study of the Classification Performance of Learners on Imbalanced and Noisy Software Quality Data [C]. Seiffert, Chris, Khoshgoftaar, . 2007
5. Partitioning filter approach to noise elimination: An empirical study in software quality classification [D]. Rebours, Pierre. 2004.
6. Peculiar Genes Selection: A new features selection method to improve classification performances in imbalanced data sets [O]. Federica Martina, Marco Beccuti, Gianfranco Balbo.
7. Empirical Assessment of Performance Measures for Preprocessing Moments in Imbalanced Data Classification Problem [O]. Szeszko, Paweł; Topczewska, Magdalena. 2016.
