
A Lightweight Data Preprocessing Strategy with Fast Contradiction Analysis for Incremental Classifier Learning



Abstract

A prime objective in constructing data stream mining models is to achieve good accuracy, fast learning, and robustness to noise. Although many techniques have been proposed in the past, efforts to improve the accuracy of classification models have been somewhat disparate. These techniques include, but are not limited to, feature selection, dimensionality reduction, and the removal of noise from training data. One limitation common to all of these techniques is the assumption that the full training dataset must be used. Although this assumption has been effective for traditional batch training, it may not be practical for incremental classifier learning, also known as data stream mining, where the data stream is seen in a single pass. Because data streams are potentially unbounded, as in the so-called big data phenomenon, the data preprocessing time must be kept to a minimum. This paper introduces a new data preprocessing strategy suitable for progressively purging noisy data from the training dataset without the need to process the whole dataset at one time. A computer simulation shows that this strategy provides the significant benefit of allowing bad records to be removed dynamically from the incremental classifier learning process.
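The abstract does not spell out how the contradiction analysis works, so the following is only a minimal sketch of the general idea of purging noisy records in a single pass. It assumes, purely for illustration, that a record is "contradictory" when a recent record in a bounded sliding window has identical feature values but a different class label. The names `ContradictionFilter`, `filter_stream`, and `window_size` are hypothetical and do not come from the paper.

```python
from collections import deque


class ContradictionFilter:
    """Hypothetical single-pass filter; NOT the paper's algorithm.

    Assumption: a record contradicts the window if a recent record has
    the same feature values but a different class label.
    """

    def __init__(self, window_size=100):
        # Bounded memory: only the most recent accepted records are kept.
        self.window = deque(maxlen=window_size)

    def accept(self, features, label):
        key = tuple(features)
        # Look for a recent record with equal features and another label.
        conflicting = any(k == key and l != label for k, l in self.window)
        if conflicting:
            return False  # drop the record before it reaches the learner
        self.window.append((key, label))
        return True


def filter_stream(stream, window_size=100):
    """Yield only the records that pass the contradiction check."""
    flt = ContradictionFilter(window_size)
    for features, label in stream:
        if flt.accept(features, label):
            yield features, label
```

Because the window is bounded, each record is examined once and the memory cost stays constant regardless of stream length. The surviving records could then be passed, one at a time, to any incremental classifier.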

Bibliographic details

  • Source
    Mathematical Problems in Engineering | 2015, Issue 3 | 125781.1-125781.11 | 11 pages
  • Author affiliations

    Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China.;

    Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China.;

    Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China.;

    Univ Teknol MARA, Fac Comp & Math Sci, Shah Alam 40450, Selangor, Malaysia.;

  • Original format: PDF
  • Language: English
