A Precise Statistical approach for concept change detection in unlabeled data streams

Niloofar Mozafari; Sattar Hashemi; Ali Hamzeh

Abstract

Data streams have recently been explored extensively because they arise in many applications, such as sensor networks, web click streams, and network flows. One of the most important challenges in data streams is concept change, in which the distribution underlying the data changes over time. The vast majority of research on data stream mining is devoted to labeled data, whereas in real-world practice labels are rarely available to the learning algorithm. Moreover, most methods that detect changes in unlabeled data streams handle only numerical data sets, and they face considerable difficulty as the dimensionality of the data increases. In this paper, we present a Precise Statistical approach for Concept Change Detection in unlabeled data streams, abbreviated as PSCCD, which detects changes using an exchangeability test. This hypothesis test is derived from a martingale and is based on Doob's Maximal Inequality. Our approach has three advantages. First, it does not require a sliding window on the data stream, whose size is a well-known challenging issue; second, it works well on multi-dimensional data streams; and last but not least, it is applicable to different types of data, including categorical, numerical, and mixed-attribute data streams. To explore these advantages, we conduct extensive experiments with different settings and specifications. The results are very promising.
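
The abstract names the ingredients of the test (a strangeness measure, a martingale, Doob's Maximal Inequality) without giving formulas. As a rough illustration only, the sketch below implements a generic randomized power martingale test for exchangeability, a standard construction from the martingale-testing literature, and is not taken from the paper. The centroid-distance strangeness measure, the function name `detect_change`, and the parameters `epsilon`, `threshold`, and `seed` are all assumptions. Under exchangeability the martingale starts at 1 and, by Doob's Maximal Inequality, exceeds a threshold λ with probability at most 1/λ, so crossing λ is treated as evidence of a change.

```python
import math
import random


def detect_change(stream, epsilon=0.92, threshold=20.0, seed=0):
    """Return the index at which a change is flagged, or None.

    Hypothetical sketch of a randomized power martingale test; the
    strangeness measure (distance to the running centroid) and all
    parameter values are illustrative assumptions, not the paper's.
    """
    rng = random.Random(seed)
    seen = []          # all points observed so far (no sliding window)
    total = None       # running coordinate-wise sum, used for the centroid
    log_m = 0.0        # log of the martingale value; M_0 = 1

    for t, x in enumerate(stream):
        x = tuple(float(v) for v in x)
        seen.append(x)
        total = x if total is None else tuple(a + b for a, b in zip(total, x))
        centroid = tuple(a / len(seen) for a in total)

        # Strangeness of every seen point relative to the current centroid.
        scores = [math.dist(p, centroid) for p in seen]
        s_new = scores[-1]

        # Randomized p-value: fraction of points at least as strange as the
        # newest one, with ties broken by a uniform random number.
        greater = sum(s > s_new for s in scores)
        equal = sum(s == s_new for s in scores)
        p_value = (greater + rng.random() * equal) / len(scores)
        p_value = max(p_value, 1e-12)  # guard against log(0)

        # Power martingale update: M_t = M_{t-1} * epsilon * p^(epsilon - 1).
        log_m += math.log(epsilon) + (epsilon - 1.0) * math.log(p_value)

        # Doob's maximal inequality bounds the false-alarm probability
        # under exchangeability by 1 / threshold.
        if log_m > math.log(threshold):
            return t
    return None
```

Calling `detect_change(points)` on an iterable of equal-length numeric tuples returns the first index at which the martingale crosses the threshold, or `None` if it never does. Because this particular strangeness measure needs a numeric distance, the sketch covers numerical data only; categorical or mixed attributes would need a different measure.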

Bibliographic details

Source: Computers & Mathematics with Applications, 2011, Issue 4, pp. 1655-1669 (15 pages)
Authors: Niloofar Mozafari; Sattar Hashemi; Ali Hamzeh
Author affiliation (all three authors): Department of Computer Science and Engineering and Information Technology, School of Electrical Computer Engineering, Zand Avenue, Shiraz University, Iran
Original format: PDF
Language: English
Keywords: data stream; concept change; martingale; exchangeability; hypothesis testing; strangeness measure

Similar literature

1. On the reliable detection of concept drift from streaming unlabeled data [J]. Sethi Tegjyot Singh, Kantardzic Mehmed. Expert Systems with Applications. 2017
2. Learning from concept drifting data streams with unlabeled data [J]. Xindong Wu, Peipei Li, Xuegang Hu. Neurocomputing. 2012
3. Detection of evolving concepts in non-stationary data streams: A multiple kernel learning approach [J]. Siahroudi Sajjad Kamali, Moodi Poorya Zare, Beigy Hamid. Expert Systems with Applications. 2018
4. Concept Drift Detection on Unlabeled Data Streams: A Systematic Literature Review [C]. Nur Laila Ab Ghani, Izzatdin Abdul Aziz, Mazlina Mehat. IEEE Conference on Big Data and Analytics. 2020
5. Relational discovery in sequentially-connected data streams: Efficient algorithms for lossless pattern discovery and change detection [D]. Coble, Jeffrey Allen. 2005
6. A Statistical Change Point Model Approach for the Detection of DNA Copy Number Variations in Array CGH Data [O]. Jie Chen, Yu-Ping Wang
7. A Precise Statistical approach for concept change detection in unlabeled data streams [O]. Mozafari Niloofar, Hashemi Sattar, Hamzeh Ali. 2011
