Journal: The Journal of Systems and Software

Countering the concept-drift problems in big data by an incrementally optimized stream mining model

Abstract

Mining the potential value hidden in big data has become a popular research topic worldwide. In an unbounded big data stream, the underlying distribution of newly arrived data may differ from that of older data. This phenomenon, known as concept drift, is common in big data mining. Over the past decade, decision tree induction methods have addressed it with multi-tree learning, which detects drift by maintaining alternative trees. However, multi-tree algorithms consume more computing resources than a single tree. This paper proposes a single-tree algorithm with an optimized node-splitting mechanism that detects drift during a test-then-train tree-building process. In the experiments, we compare the new method with several state-of-the-art single-tree and multi-tree algorithms. The results show that the new algorithm achieves good accuracy with a more compact model and lower memory use than the others.
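The test-then-train process mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: `BucketLearner`, `drifting_stream`, and `prequential_accuracy` are hypothetical names introduced here, the learner is a toy sliding-window majority vote rather than a decision tree, and the stream is synthetic with one abrupt drift. The sketch only shows how each example is first used for testing and then for training, and how accuracy dips after the concept changes.

```python
import random
from collections import deque


class BucketLearner:
    """Toy incremental learner (hypothetical stand-in, not the paper's tree)."""

    def __init__(self, window=100):
        # One sliding window of recent labels per half of the feature range.
        self.recent = [deque(maxlen=window), deque(maxlen=window)]

    def _bucket(self, x):
        return 1 if x > 0.5 else 0

    def predict(self, x):
        labels = self.recent[self._bucket(x)]
        # Majority vote over recent labels; predicts 0 when nothing has been seen.
        return 1 if 2 * sum(labels) > len(labels) else 0

    def learn(self, x, y):
        self.recent[self._bucket(x)].append(y)


def drifting_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs whose labeling rule flips at index `drift_at`."""
    rng = random.Random(seed)
    for i in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if i >= drift_at:
            y = 1 - y  # abrupt concept drift: same inputs, opposite labels
        yield x, y


def prequential_accuracy(stream, learner, block=500):
    """Test-then-train: predict each example first, then learn from it."""
    accuracies, correct, seen = [], 0, 0
    for x, y in stream:
        correct += int(learner.predict(x) == y)  # test on the unseen example
        learner.learn(x, y)                      # then train on it
        seen += 1
        if seen == block:
            accuracies.append(correct / block)
            correct, seen = 0, 0
    return accuracies


# Per-block accuracy; the block right after the drift point should drop,
# then recover once the sliding windows fill with post-drift labels.
acc = prequential_accuracy(drifting_stream(3000, drift_at=1500), BucketLearner())
print(acc)
```

The sliding window is what lets this toy learner forget the old concept. The abstract describes drift handling inside the node-splitting mechanism of a single tree, which this sketch does not attempt to reproduce.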
