Countering the concept-drift problems in big data by an incrementally optimized stream mining model

Hang Yang; Simon Fong

Abstract

Mining the potential value hidden in big data has become a popular research topic worldwide. In an unbounded data stream, the underlying distribution of newly arrived data may differ from that of older data. This phenomenon, known as concept drift, is common in big data mining. Over the past decade, decision tree induction methods have used multi-tree learning, detecting drift by maintaining alternative trees. However, multi-tree algorithms consume more computing resources than single-tree algorithms. This paper proposes a single tree with an optimized node-splitting mechanism that detects drift in a test-then-train tree-building process. In the experiments, we compare the performance of the new method with state-of-the-art single-tree and multi-tree algorithms. The results show that the new algorithm achieves good accuracy with a more compact model and lower memory use than the others.


Bibliographic details

Source
The Journal of Systems and Software | 2015, Issue 4 | pp. 158-166 | 9 pages
Authors
Hang Yang; Simon Fong
Affiliations

Electric Power Research Institute, China Southern Power Grid, China;

Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau;

Format PDF
Language English (eng)
Keywords
Concept drift; Data stream mining; Very fast decision tree

