Autonomous classification models in ubiquitous environments

Abstract

Stream mining is a set of techniques designed to process data streams in real time in order to extract knowledge. In the particular case of classification, stream mining has to adapt its behaviour to volatile underlying data distributions, a phenomenon known as concept drift. Concept drift may invalidate predictive models, which must then be updated to represent the concepts currently present in the data. A specific type of concept drift, known as recurrent concept drift, occurs when the concepts represented by the data have already appeared in the past. In those cases the learning process can be avoided, or at least minimized, by applying a previously trained model. This could be extremely useful in ubiquitous environments, which are characterized by resource-constrained devices. To deal with this scenario, meta-models can be used to enhance the drift detection mechanisms used by data stream algorithms, by representing and predicting when the change will occur. There are real-world situations where a concept reappears, as in intrusion detection systems (IDS), where the same incidents, or adaptations of them, usually reappear over time. In these environments, early prediction of drift through better knowledge of past models can help to anticipate the change, thus improving the efficiency of the model with respect to the training instances needed. Using meta-models as a recurrent drift detection mechanism also makes it possible to share concept representations among different data mining processes. Such exchanges could improve the accuracy of the resulting local model, as that model may benefit from patterns similar to the local concept that have been observed in other scenarios but not yet locally.
Exchanging models would also improve the efficiency with which training instances are used during classification, since it helps devices apply recurrent models that have already been trained and previously seen by any of the collaborating devices. In other words, the scope of recurrence detection and representation is broadened. The detection, representation and exchange of concept drift patterns would in fact be extremely useful for law enforcement activities against cyber crime. Since information exchange is one of the main pillars of cooperation, national units would benefit from the experience and knowledge gained by third parties. Moreover, in the specific field of critical infrastructure protection it is crucial to have information exchange mechanisms at both the strategic and technical levels. The exchange of concept drift detection schemes in cyber security environments would aid in preventing, detecting and effectively responding to threats in cyberspace. Furthermore, as a complement to meta-models, a mechanism to assess the similarity between classification models is also needed when dealing with recurrent concepts. When a previously trained model is reused, the comparison between concepts is usually a rough one based on boolean logic. Introducing fuzzy logic comparisons between models could lead to more efficient reuse of previously seen concepts, by applying not just equal models but also similar ones. This work addresses these open issues through: the MMPRec system, which integrates a meta-model mechanism and a fuzzy similarity function; a collaborative environment for sharing meta-models between different devices; and a recurrent drift generator for testing the usefulness of recurrent drift systems such as MMPRec. The thesis also presents an experimental validation of the proposed contributions using synthetic and real datasets.
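
The contrast between boolean and fuzzy comparison of models can be illustrated with a minimal Python sketch. It is not the MMPRec similarity function, which the abstract does not specify. As an assumption made purely for illustration, similarity is taken here to be the fraction of instances in a shared sample on which two classifiers predict the same label. The names `similarity`, `find_reusable`, `threshold` and the toy models are all hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Assumption: a "model" is any function mapping a feature vector to a class label.
Model = Callable[[Sequence[int]], int]


def similarity(a: Model, b: Model, sample: Sequence[Sequence[int]]) -> float:
    """Degree in [0, 1] to which two models agree on a shared sample."""
    if not sample:
        raise ValueError("sample must not be empty")
    agree = sum(1 for x in sample if a(x) == b(x))
    return agree / len(sample)


def find_reusable(
    current: Model,
    repository: Dict[str, Model],
    sample: Sequence[Sequence[int]],
    threshold: float = 1.0,
) -> Optional[Tuple[str, float]]:
    """Return (name, score) of the most similar stored model whose score
    reaches the threshold, or None if no stored model qualifies.

    threshold=1.0 mimics a boolean "equal or not" comparison;
    a lower threshold also accepts models that are merely similar.
    """
    best: Optional[Tuple[str, float]] = None
    for name, stored in repository.items():
        score = similarity(current, stored, sample)
        if score >= threshold and (best is None or score > best[1]):
            best = (name, score)
    return best


# Toy classifiers standing in for previously learned concepts (hypothetical).
def concept_a(x: Sequence[int]) -> int:
    return int(x[0] > 5)


def concept_a_variant(x: Sequence[int]) -> int:
    return int(x[0] > 6)  # a slight adaptation of concept_a


def concept_b(x: Sequence[int]) -> int:
    return int(x[1] > 5)  # an unrelated concept


sample = [(i, 9 - i) for i in range(10)]
repository = {"a_variant": concept_a_variant, "b": concept_b}

strict = find_reusable(concept_a, repository, sample, threshold=1.0)
fuzzy = find_reusable(concept_a, repository, sample, threshold=0.8)
print(strict)  # None: no stored model is exactly equal on the sample
print(fuzzy)   # ('a_variant', 0.9): a similar model is accepted for reuse
```

Under the strict setting the slightly adapted concept is rejected and would have to be relearned, whereas the graded score allows it to be reused. The threshold value of 0.8 is arbitrary and would need tuning in any real setting.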

Bibliographic record

  • Author

    Abad Arranz Miguel Angel

  • Author affiliation
  • Year 2015
  • Total pages
  • Original format PDF
  • Language eng
  • Chinese Library Classification
