Modeling recurring concepts in data streams: a graph-based framework

Ahmadi Zahra; Kramer Stefan


Abstract

Classifying a non-stationary data stream with recurrent drift is a challenging task that has attracted considerable interest in recent years. All existing approaches to handling recurrent concepts maintain a pool of concepts/classifiers and use that pool for future classifications to reduce the error on instances from a recurring concept. However, the number of classifiers in the pool usually grows very quickly, because accurately detecting an underlying concept is itself a challenging task. As a result, many concepts in the pool may represent the same underlying concept. This paper proposes the GraphPool framework, which refines the pool of concepts by applying a merging mechanism whenever necessary: after receiving a new batch of data, we extract a concept representation from the current batch that takes the correlation among features into account. Then, we compare the current batch representation to the concept representations in the pool using a statistical multivariate likelihood test. If more than one concept is similar to the current batch, all the corresponding concepts are merged. GraphPool not only keeps the concepts but also maintains the transitions among concepts via a first-order Markov chain. The current state is maintained at all times, and new instances are predicted based on it. Keeping these transitions helps to recover quickly from drifts in some real-world problems with periodic behavior. Comprehensive experiments on synthetic and real-world data show the effectiveness of the framework in terms of performance and pool management.
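The pool-merging and transition-tracking idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `ConceptPool`, the `threshold` parameter, and the methods `update` and `likely_next` are hypothetical, each concept is reduced to a mean feature vector, and the statistical multivariate likelihood test is replaced by a plain Euclidean distance check as a stand-in.

```python
from collections import defaultdict
from math import dist


def batch_mean(batch):
    """Column-wise mean of a batch given as a list of feature vectors."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


class ConceptPool:
    """Toy concept pool with merging and first-order transition counts.

    Assumption: a concept is summarized by a mean vector, and similarity is
    Euclidean distance <= threshold (a stand-in for a likelihood test).
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.means = {}                 # concept id -> mean vector
        self.weights = {}               # concept id -> batches absorbed
        self.counts = defaultdict(int)  # (src id, dst id) -> transition count
        self.current = None             # current state of the chain
        self._next_id = 0

    def _absorb(self, target, mean, weight):
        """Fold a weighted mean vector into the target concept."""
        w = self.weights[target]
        self.means[target] = [
            (a * w + b * weight) / (w + weight)
            for a, b in zip(self.means[target], mean)
        ]
        self.weights[target] = w + weight

    def _merge(self, target, other):
        """Merge concept `other` into `target` and redirect its transitions."""
        self._absorb(target, self.means.pop(other), self.weights.pop(other))
        for (src, dst), n in list(self.counts.items()):
            if other in (src, dst):
                del self.counts[(src, dst)]
                new_src = target if src == other else src
                new_dst = target if dst == other else dst
                self.counts[(new_src, new_dst)] += n
        if self.current == other:
            self.current = target

    def update(self, batch):
        """Process one batch and return the id of the matching concept."""
        mean = batch_mean(batch)
        similar = [c for c, m in self.means.items()
                   if dist(m, mean) <= self.threshold]
        if similar:
            # Every concept similar to the batch is merged into the first one.
            target = similar[0]
            for other in similar[1:]:
                self._merge(target, other)
            self._absorb(target, mean, 1)
        else:
            # No similar concept: the batch starts a new one.
            target = self._next_id
            self._next_id += 1
            self.means[target] = mean
            self.weights[target] = 1
        if self.current is not None:
            self.counts[(self.current, target)] += 1
        self.current = target
        return target

    def likely_next(self):
        """Most frequently observed successor of the current concept, if any."""
        out = {dst: n for (src, dst), n in self.counts.items()
               if src == self.current}
        return max(out, key=out.get) if out else None
```

Calling `update` once per incoming batch keeps the pool compact, and `likely_next` reads the transition counts to suggest which concept tends to follow the current one, which is the kind of information that could help with periodic drift.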


Bibliographic details

Source: Knowledge and Information Systems, 2018, Issue 1, 30 pages
Authors: Ahmadi Zahra; Kramer Stefan
Author affiliation: Johannes Gutenberg Univ Mainz, Inst Informat, Mainz, Germany
Original format: PDF
Language: English
Chinese Library Classification: automatic information theory
Keywords: Pool management; Recurring concepts; Concept drift; Data stream classification


Similar literature

1. Modeling recurring concepts in data streams: a graph-based framework [J]. Ahmadi Zahra, Kramer Stefan. Knowledge and information systems. 2018, Issue 1.

2. Semi-supervised classification on data streams with recurring concept drift and concept evolution [J]. Zheng Xiulin, Li Peipei, Hu Xuegang, et al. Knowledge-Based Systems. 2021, Issue Mara5.

3. Tracking Recurring Concepts from Evolving Data Streams using Ensemble Method [J]. Sun Yange, Wang Zhihai, Yuan Jidong, et al. The international arab journal of information technology. 2019, Issue 6.

4. CPF: Concept Profiling Framework for Recurring Drifts in Data Streams [C]. Robert Anderson, Yun Sing Koh, Gillian Dobbie. Australasian joint conference on artificial intelligence. 2016.

5. Development of a conceptual graph-based information retrieval model for medical question databases [D]. Huang, Huan. 2004.

6. Common integration sites of published datasets identified using a graph-based framework [O]. Alessandro Vasciaveo, Ivana Velevska, Gianfranco Politano, et al. 2016.

7. Predicting recurring concepts on data-streams by means of a meta-model and a fuzzy similarity function [O]. Abad Arranz Miguel Ángel, Gomes João Bartolo, Menasalvas Ruiz Ernestina. 2015.

8. Framework for Graph-Based Synthesis, Analysis, and Visualization of HPC Cluster Job Data [R]. Brandt, J., De Sapio, V., Gentile, A., et al. 2010.
