Venue: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Combining proactive and reactive predictions for data streams

Abstract

Mining data streams is important in both science and commerce. Two major challenges are that (1) the data may grow without limit, making it difficult to retain a long history, and (2) the underlying concept of the data may change over time. Unlike the common practice of keeping recent raw data, this paper uses a measure of conceptual equivalence to organize the data history into a history of concepts. As concepts change, it identifies new concepts as well as reappearing ones, and learns transition patterns among concepts to help prediction. Unlike conventional methods that passively wait until the concept changes, this paper combines proactive and reactive predictions. In proactive mode, it anticipates what the new concept will be if a future concept change takes place, and prepares prediction strategies in advance. If the anticipation turns out to be correct, a suitable prediction model can be launched immediately upon the concept change. If not, it promptly falls back to reactive mode, adapting a prediction model to the new data. A system, RePro, is proposed to implement these ideas. Experiments compare the system with representative existing prediction methods on benchmark data sets that represent diverse scenarios of concept change. The empirical evidence indicates that the proposed methodology is an effective and efficient approach to prediction for data streams.
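The interplay between a history of concepts, learned transition patterns, and the proactive and reactive modes can be illustrated with a small sketch. This is not the RePro algorithm itself, which the abstract does not specify. Everything here is an assumption made for illustration: the class `ConceptHistory`, the `learn` and `accuracy` callbacks, the first-order transition counts, and the accuracy `threshold`, which stands in for the paper's measure of conceptual equivalence. Concept change detection is also assumed to happen elsewhere.

```python
from collections import Counter, defaultdict


class ConceptHistory:
    """Toy history of concepts with first-order transition counts (hypothetical)."""

    def __init__(self):
        self.models = {}                         # concept id -> stored model
        self.transitions = defaultdict(Counter)  # previous id -> counts of next ids
        self.current = None                      # id of the concept in force

    def anticipate(self):
        """Proactive step: most frequent successor of the current concept, or None."""
        successors = self.transitions.get(self.current)
        if not successors:
            return None
        return successors.most_common(1)[0][0]

    def on_change(self, recent_batch, learn, accuracy, threshold=0.9):
        """Handle a detected concept change and return (concept id, mode)."""
        guess = self.anticipate()
        # Proactive mode: launch the anticipated model if it fits the new data.
        if guess is not None and accuracy(self.models[guess], recent_batch) >= threshold:
            chosen, mode = guess, "proactive"
        else:
            # Reactive mode: reuse a stored concept that fits, else learn a new one.
            chosen, mode = None, "reactive"
            for concept_id, model in self.models.items():
                if accuracy(model, recent_batch) >= threshold:
                    chosen = concept_id
                    break
            if chosen is None:
                chosen = len(self.models)
                self.models[chosen] = learn(recent_batch)
        # Record the observed transition so later anticipation can use it.
        if self.current is not None:
            self.transitions[self.current][chosen] += 1
        self.current = chosen
        return chosen, mode


# Hypothetical learner: a "model" is just the majority label of a batch.
def learn(batch):
    return Counter(label for _, label in batch).most_common(1)[0][0]


def accuracy(model, batch):
    return sum(label == model for _, label in batch) / len(batch)


history = ConceptHistory()
batch_a = [(i, 0) for i in range(10)]  # concept A: every label is 0
batch_b = [(i, 1) for i in range(10)]  # concept B: every label is 1
results = [history.on_change(batch, learn, accuracy)
           for batch in (batch_a, batch_b, batch_a, batch_b)]
print(results)
```

In this toy run the first three changes are handled reactively, since no transition out of the current concept has been observed yet. The fourth is handled proactively, because the earlier move from concept 0 to concept 1 lets the stored model for concept 1 be launched directly.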
