Conference: Italian Symposium on Advanced Database Systems

Detecting and Explaining Exceptional Values in Categorical Data: DISCUSSION PAPER

Abstract

In this work we address the problem of detecting and explaining exceptionally behaving values in categorical datasets. An attribute value is perceived as anomalous if its frequency occurrence is exceptionally typical or untypical within the distribution of the frequency occurrences of the other attribute values. The notion of frequency occurrence is obtained by specialising the Kernel Density Estimation (KDE) method to the domain of frequency values, and an outlierness measure is defined by leveraging the cumulative distribution function (cdf) of the resulting density. This measure simultaneously identifies two kinds of anomalies, called lower outliers and upper outliers, namely values with exceptionally low or exceptionally high frequency. Moreover, data values labeled as outliers come with interpretable explanations of their abnormality, which is a desirable feature of any knowledge discovery technique.
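The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's actual measure, whose exact definition is not given here. The sketch assumes a Gaussian kernel, a hand-picked bandwidth, a leave-one-out comparison against the frequencies of the other values, and a simple tail threshold on the cdf. The names `frequency_outlierness`, `bandwidth`, and `tail` are hypothetical.

```python
import math
from collections import Counter


def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_cdf(x, samples, bandwidth):
    """cdf at x of a Gaussian KDE built on `samples`.

    The cdf of a kernel mixture is the mean of the per-kernel cdfs.
    """
    return sum(gaussian_cdf((x - s) / bandwidth) for s in samples) / len(samples)


def frequency_outlierness(column, bandwidth=5.0, tail=0.05):
    """Label each distinct value as a 'lower', 'upper', or 'normal' case.

    Assumption (not from the paper): a value is flagged when the KDE cdf of
    the other values' frequencies, evaluated at its own frequency, falls in
    the lower or upper `tail` of the distribution.
    """
    counts = Counter(column)
    scores = {}
    for value, freq in counts.items():
        others = [c for v, c in counts.items() if v != value]
        if not others:
            continue  # a single distinct value has nothing to compare against
        p = kde_cdf(freq, others, bandwidth)
        if p <= tail:
            label = "lower"
        elif p >= 1.0 - tail:
            label = "upper"
        else:
            label = "normal"
        scores[value] = (p, label)
    return scores


# Toy categorical attribute: four values of similar frequency,
# one very rare value ("e") and one very frequent value ("f").
column = ["a"] * 50 + ["b"] * 48 + ["c"] * 52 + ["d"] * 49 + ["e"] * 1 + ["f"] * 300
scores = frequency_outlierness(column)
for value, (p, label) in sorted(scores.items()):
    print(f"{value}: cdf={p:.3f} label={label}")
```

In this toy column the rare value "e" lands in the lower tail and the dominant value "f" in the upper tail, while the four mid-frequency values stay unflagged. The cdf value itself can serve as a rough explanation: it says what share of the estimated frequency mass lies below the value's own frequency.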