A Lightweight Data Preprocessing Strategy with Fast Contradiction Analysis for Incremental Classifier Learning

Fong Simon; Biuk-Aghai Robert P.; Si Yain-whar; Yap Bee Wah


Abstract

A prime objective in constructing data stream mining models is to achieve good accuracy, fast learning, and robustness to noise. Although many techniques have been proposed in the past, efforts to improve the accuracy of classification models have been somewhat disparate. These techniques include, but are not limited to, feature selection, dimensionality reduction, and the removal of noise from training data. One limitation common to all of these techniques is the assumption that the full training dataset must be available. Although this has been effective for traditional batch training, it may not be practical for incremental classifier learning, also known as data stream mining, where the data stream is seen in a single pass. Because data streams can be unbounded, as in the so-called big data phenomenon, the data preprocessing time must be kept to a minimum. This paper introduces a new data preprocessing strategy suitable for the progressive purging of noisy data from the training dataset without the need to process the whole dataset at one time. A computer simulation shows that this strategy provides the significant benefit of allowing bad records to be removed dynamically from the incremental classifier learning process.
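The abstract does not spell out the contradiction analysis itself, so the following is only a minimal sketch of the general idea of purging noisy records in a single pass. It assumes categorical feature tuples and treats a record as "contradicting" when recent records with identical feature values mostly carry a different label. The names `ContradictionFilter`, `purge_stream`, and `window_size` are hypothetical and are not taken from the paper.

```python
from collections import Counter, deque


class ContradictionFilter:
    """Hypothetical sliding-window filter; an illustration, not the paper's algorithm."""

    def __init__(self, window_size=100):
        # Only the most recent records are kept, so memory stays bounded.
        self.window = deque(maxlen=window_size)

    def accept(self, features, label):
        """Return True if the record should be passed on to the learner."""
        key = tuple(features)
        # Count the labels of recent records that share the same feature values.
        seen = Counter(lbl for feats, lbl in self.window if feats == key)
        # Remember the record even if it is rejected, so that a genuine
        # change in the stream can eventually shift the majority label.
        self.window.append((key, label))
        if not seen:
            return True  # nothing in the window to contradict
        majority_label, _ = seen.most_common(1)[0]
        return label == majority_label


def purge_stream(stream, window_size=100):
    """Yield only the records that do not contradict the recent window."""
    flt = ContradictionFilter(window_size)
    for features, label in stream:
        if flt.accept(features, label):
            yield features, label


if __name__ == "__main__":
    stream = [
        ((1, 0), "a"),
        ((1, 0), "a"),
        ((1, 0), "b"),  # contradicts the two earlier (1, 0) records
        ((0, 1), "b"),
        ((1, 0), "a"),
    ]
    for record in purge_stream(stream, window_size=10):
        print(record)
```

Each record is inspected once and compared only against a fixed-size window, so the cost per record does not grow with the length of the stream. An incremental learner would consume the output of `purge_stream` in place of the raw stream.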


Bibliographic record

Source: Mathematical Problems in Engineering, 2015, Issue 3, pp. 125781.1-125781.11 (11 pages)
Authors: Fong Simon; Biuk-Aghai Robert P.; Si Yain-whar; Yap Bee Wah
Author affiliations:

Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China.;

Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China.;

Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China.;

Univ Teknol MARA, Fac Comp & Math Sci, Shah Alam 40450, Selangor, Malaysia.;

Original format: PDF
Language: English

Similar literature

1. A Lightweight Data Preprocessing Strategy with Fast Contradiction Analysis for Incremental Classifier Learning [J]. Simon Fong, Robert P. Biuk-Aghai, Yain-whar Si. Mathematical Problems in Engineering: Theory, Methods and Applications, 2015, Issue 5.
2. A Lightweight Data Preprocessing Strategy with Fast Contradiction Analysis for Incremental Classifier Learning [J]. Simon Fong, Robert P. Biuk-Aghai, Yain-whar Si. Mathematical Problems in Engineering: Theory, Methods and Applications, 2015, Issue 3.
3. Prediction of secondary testosterone deficiency using machine learning: A comparative analysis of ensemble and base classifiers, probability calibration, and sampling strategies in a slightly imbalanced dataset [J]. Monique Tonani Novaes, Osmar Luiz Ferreira de Carvalho, Pedro Henrique Guimarães Ferreira. Informatics in Medicine Unlocked, 2021, Issue a.
4. Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy [C]. Sang-Woo Lee, Min-Oh Heo, Jiwon Kim. International Conference on Machine Learning, 2015.
5. Creating fast and accurate machine learning ensembles through training dataset preprocessing [D]. Whitehead, Matthew E. N. 2010.
6. Streaming chunk incremental learning for class-wise data stream classification with fast learning speed and low structural complexity [O]. Prem Junsawang, Suphakant Phimoltares, Chidchanok Lursinsap. 2012.
7. Astronomical Data Preprocessing Implementation Based on FPGA and Data Transformation Strategy for the FAST Telescope as a Giant CPS [O]. Yuefeng Song, Yongxin Zhu, Junjie Hou. 2020.
