Venue: International Conference on Data Engineering

LOCUST: An Online Analytical Processing Framework for High Dimensional Classification of Data Streams


Abstract

In recent years, data streams have become ubiquitous because of advances in hardware and software technology. Adapting conventional mining problems to the data stream environment is a great challenge. Many data streams are inherently high-dimensional, which creates a special challenge for data mining algorithms. In this paper, we consider the problem of classifying high-dimensional data streams. In the high-dimensional case, even traditional classifiers do not work very well on fixed data sets. We discuss a number of insights into the intractability of the high-dimensional case, and use these insights to propose a new classification method (LOCUST) which avoids many of these weaknesses. The key is to develop a subspace-based, instance-centered classification approach which can be implemented efficiently for a fast data stream. We propose a methodology to process the data stream effectively in an organized way, so that the intermediate data structures can be used to sample locally discriminative subspaces for the classification process. We show that LOCUST works effectively in the high-dimensional case, and is also flexible, offering increased robustness with greater resource availability.
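The abstract does not spell out the algorithm, so the following is only a rough illustration of what "instance-centered classification over sampled subspaces" could look like in general, not the LOCUST method itself. Everything here is an assumption made for the sketch: the function name `classify_instance`, the parameters `radius` and `min_support`, the per-dimension box used as the locality, and the purity-weighted vote. It also works on a fixed in-memory training set rather than the stream summaries the paper describes.

```python
import random
from collections import Counter


def classify_instance(train, labels, x, n_subspaces=50, subspace_dim=2,
                      radius=0.5, min_support=3, seed=0):
    """Hypothetical sketch: vote over random subspaces sampled around x.

    This is NOT the published LOCUST algorithm; it only illustrates the
    general idea of judging class behaviour locally, per test instance,
    in low-dimensional projections.
    """
    rng = random.Random(seed)
    n_dims = len(x)
    k = min(subspace_dim, n_dims)
    votes = Counter()
    for _ in range(n_subspaces):
        # Sample a small subset of dimensions (one candidate subspace).
        dims = rng.sample(range(n_dims), k)
        # Labels of training points close to x on every sampled dimension.
        local = [lab for row, lab in zip(train, labels)
                 if all(abs(row[d] - x[d]) <= radius for d in dims)]
        if len(local) < min_support:
            continue  # too few neighbours to judge this subspace
        label, count = Counter(local).most_common(1)[0]
        # Weight the vote by how strongly one class dominates locally.
        votes[label] += count / len(local)
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    # Toy data: class "a" sits near 0 and class "b" near 5 in every dimension.
    train = [(0.0, 0.1, 0.2, 0.0), (0.1, 0.0, 0.1, 0.2), (0.2, 0.1, 0.0, 0.1),
             (5.0, 5.1, 5.2, 5.0), (5.1, 5.0, 5.1, 5.2), (5.2, 5.1, 5.0, 5.1)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(classify_instance(train, labels, (0.1, 0.1, 0.1, 0.1)))
```

In this sketch, raising `n_subspaces` spends more computation per test instance in exchange for a vote over more subspaces, which loosely mirrors the abstract's remark about greater robustness with greater resource availability.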
