Journal: Complexity

Application of the Variable Precision Rough Sets Model to Estimate the Outlier Probability of Each Element

Abstract

In a data mining process, outlier detection aims to identify outliers by exploiting their high marginality, measuring their degree of deviation from representative patterns and thereby yielding relevant knowledge. Rough sets (RS) theory has been applied to the field of knowledge discovery in databases (KDD) since its formulation in the 1980s, whereas in recent years outlier detection has been increasingly regarded as a KDD process that is useful in its own right. The application of RS theory as a basis to characterise and detect outliers is a novel approach with great theoretical relevance and practical applicability. However, it is difficult to develop algorithms whose spatial and temporal complexity allows their application to realistic scenarios involving vast amounts of data and requiring very fast responses. This study presents a theoretical framework based on a generalisation of RS theory, termed the variable precision rough sets model (VPRS), which supports a stochastic approach to assessing whether a given element is an outlier within a specific universe of data. An algorithm derived from quasi-linearisation is developed based on this theoretical framework, thus enabling its application to large volumes of data. The experiments conducted demonstrate the feasibility of the proposed algorithm, whose usefulness is contextualised by comparison with different algorithms analysed in the literature.
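For readers unfamiliar with VPRS, the following minimal Python sketch illustrates the standard β-lower and β-upper approximations on which the model rests. It is not the algorithm proposed in the paper, and it does not compute outlier probabilities. It only shows how a precision threshold β relaxes the classical rough-set approximations. The function names, the `key` indiscernibility function, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def equivalence_classes(universe, key):
    """Partition the universe by an indiscernibility key (hypothetical helper)."""
    classes = defaultdict(set)
    for x in universe:
        classes[key(x)].add(x)
    return list(classes.values())


def inclusion_error(block, target):
    """Relative classification error c(E, X) = 1 - |E & X| / |E|."""
    return 1 - Fraction(len(block & target), len(block))


def vprs_approximations(universe, key, target, beta):
    """Return the beta-lower and beta-upper approximations of target."""
    if not 0 <= beta < Fraction(1, 2):
        raise ValueError("beta must satisfy 0 <= beta < 0.5")
    lower, upper = set(), set()
    for block in equivalence_classes(universe, key):
        err = inclusion_error(block, target)
        if err <= beta:        # block is "mostly inside" the target
            lower |= block
        if err < 1 - beta:     # block is not "mostly outside" the target
            upper |= block
    return lower, upper


# Toy universe 1..10 split into two classes: {1..5} and {6..10} (assumed data).
universe = range(1, 11)
target = {1, 2, 3, 4, 6}


def group(x):
    return (x - 1) // 5


lower, upper = vprs_approximations(universe, group, target, Fraction(1, 4))
```

With β = 1/4 the class {1, ..., 5} (error 1/5) enters both approximations, while {6, ..., 10} (error 4/5) enters neither. Setting β = 0 recovers the classical rough-set approximations, where the lower approximation of this target is empty and the upper approximation is the whole universe.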
