
Detecting Outliers under Interval Uncertainty: A New Algorithm Based on Constraint Satisfaction


Abstract

In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values x1,...,xn, compute the sample average E and the sample standard deviation sigma, and then mark a value x as an outlier if x lies outside the k0-sigma interval [E-k0*sigma, E+k0*sigma] (for some pre-selected parameter k0). In real life, we often know only interval ranges [xi-,xi+] for the normal values x1,...,xn. In this case, we have only intervals of possible values for the bounds L=E-k0*sigma and U=E+k0*sigma. We can therefore identify outliers as values that are outside all possible k0-sigma intervals, i.e., values that are outside the interval [L-,U+], where L- is the smallest possible value of L and U+ is the largest possible value of U. In general, the problem of computing L- and U+ is NP-hard; a polynomial-time algorithm is known for the case when the measurements are sufficiently accurate, i.e., when the "narrowed" intervals do not intersect each other. In this paper, we use constraint satisfaction to show that L- and U+ can be computed efficiently under a weaker (and more general) condition: that no narrowed interval is a proper subinterval of another narrowed interval.
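To make the definitions of L- and U+ concrete, here is a minimal Python sketch. It is not the constraint-satisfaction algorithm from the paper, which the abstract does not spell out. It is a brute-force check that tries every combination of interval endpoints, so it takes exponential time and suits only small n. Trying endpoints is enough because sigma is a convex function of (x1,...,xn): L is therefore concave and U convex, so the minimum of L and the maximum of U over the box of intervals are reached at corner points. The function names are hypothetical, and the n-1 divisor used by `statistics.stdev` is an assumption about what "sample standard deviation" means here.

```python
from itertools import product
from statistics import mean, stdev


def k_sigma_interval(xs, k0):
    """Classic k0-sigma interval [E - k0*sigma, E + k0*sigma] for exact values.

    Assumption: sigma uses the n-1 divisor (statistics.stdev); swap in
    statistics.pstdev for the 1/n convention. Needs at least two values.
    """
    e = mean(xs)
    sigma = stdev(xs)
    return e - k0 * sigma, e + k0 * sigma


def guaranteed_outlier_bounds(intervals, k0):
    """Brute-force L- and U+ over all 2^n endpoint combinations.

    `intervals` is a list of (lower, upper) pairs. Illustrative only:
    exponential time, not the efficient method the abstract refers to.
    """
    l_minus = float("inf")
    u_plus = float("-inf")
    for xs in product(*intervals):  # every corner of the box
        low, high = k_sigma_interval(xs, k0)
        l_minus = min(l_minus, low)
        u_plus = max(u_plus, high)
    return l_minus, u_plus


def is_guaranteed_outlier(x, intervals, k0):
    """True if x lies outside every possible k0-sigma interval."""
    l_minus, u_plus = guaranteed_outlier_bounds(intervals, k0)
    return x < l_minus or x > u_plus
```

When every interval is degenerate (lower equals upper), the result reduces to the traditional k0-sigma interval, which gives a quick sanity check on the sketch.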
