Journal: International Journal of Software Engineering and Knowledge Engineering

Efficient Mining of Data Streams Using Associative Classification Approach

Abstract

Received 10 January 2014; revised 23 April 2014; accepted 6 August 2014.

Data stream associative classification poses many challenges to the data mining community. In this paper, we address four of them: infinite length, knowledge extraction in a single scan, processing time, and accuracy. Because data streams are infinite in length, it is impractical to store and use all the historical data for training. Mining such streaming data for knowledge acquisition is both a unique opportunity and a difficult task. A streaming algorithm must extract knowledge while scanning the data only once, and processing time and accuracy are two important aspects of mining data streams. We propose PSTMiner, which takes the nature of data streams into account and provides an efficient classifier for predicting the class labels of real data streams. It has greater potential than many existing classification techniques. Additionally, we propose a novel compact tree structure called PSTree (Prefix Streaming Tree) for storing data. Extensive experiments on 24 real datasets from the UCI repository and on synthetic datasets from MOA (Massive Online Analysis) show that PSTMiner is consistent. Empirical results show that PSTMiner is highly competitive in accuracy and processing time when compared with other approaches under the windowed streaming model.
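The abstract does not describe how PSTree is built, so the following is only a generic sketch of the two constraints it names: each record is scanned once, and only a bounded window of recent records is kept. It is not the authors' PSTree or PSTMiner. The class `WindowedPrefixTree`, its methods, and the item strings are all hypothetical names chosen for illustration, and sorting items alphabetically is an assumed canonical order.

```python
from collections import deque


class Node:
    """One prefix-tree node: a count plus children keyed by item."""

    def __init__(self):
        self.count = 0
        self.children = {}


class WindowedPrefixTree:
    """Hypothetical sliding-window prefix tree (not the paper's PSTree)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.root = Node()
        self.window = deque()  # records currently inside the window

    def _update(self, items, delta):
        # Walk the path for `items`, adding `delta` to every node on it.
        node = self.root
        path = []
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node()
                node.children[item] = child
            child.count += delta
            path.append((node, item, child))
            node = child
        # Prune nodes whose count dropped to zero, leaf first.
        for parent, item, child in reversed(path):
            if child.count == 0:
                del parent.children[item]

    def add(self, record):
        # Each record is read exactly once, in an assumed sorted order.
        items = tuple(sorted(record))
        if len(self.window) == self.window_size:
            # Window is full: forget the oldest record before inserting.
            self._update(self.window.popleft(), -1)
        self.window.append(items)
        self._update(items, +1)

    def prefix_count(self, prefix):
        # Number of in-window records whose sorted items start with `prefix`.
        node = self.root
        for item in sorted(prefix):
            node = node.children.get(item)
            if node is None:
                return 0
        return node.count


tree = WindowedPrefixTree(window_size=2)
for record in (["a", "b"], ["a", "c"], ["b", "c"]):
    tree.add(record)
```

With a window of two records, adding the third record evicts the first, so memory stays bounded however long the stream runs. Note that `prefix_count` only counts records sharing a sorted prefix; it is not a general itemset support count, and a real associative classifier would still need a rule-generation step on top of such a structure.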
