Optimal outlier removal in high-dimensional spaces

Dunagan J.; Vempala S.

Abstract

We study the problem of finding an outlier-free subset of a set of points (or a probability distribution) in $n$-dimensional Euclidean space. As in [BFKV 99], a point $x$ is defined to be a $\beta$-outlier if there exists some direction $w$ in which its squared distance from the mean along $w$ is greater than $\beta$ times the average squared distance from the mean along $w$. Our main theorem is that for any $\epsilon > 0$, there exists a $(1 - \epsilon)$ fraction of the original distribution that has no $O\left(\frac{n}{\epsilon}\left(b + \log\frac{n}{\epsilon}\right)\right)$-outliers, improving on the previous bound of $O(n^7 b/\epsilon)$. This is asymptotically the best possible, as shown by a matching lower bound. The theorem is constructive, and results in a $\frac{1}{1-\epsilon}$ approximation to the following optimization problem: given a distribution $\mu$ (i.e. the ability to sample from it) and a parameter $\epsilon > 0$, find the minimum $\beta$ for which there exists a subset of probability at least $(1 - \epsilon)$ with no $\beta$-outliers. (C) 2003 Elsevier Inc. All rights reserved. [References: 8]
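
The outlier definition in the abstract can be made concrete for a finite point set. For a fixed point, the ratio of its squared distance from the mean along a direction to the average squared distance along that direction is largest when the direction is aligned with the inverse second-moment matrix applied to the centered point. The maximum equals the squared Mahalanobis distance, a standard consequence of the Cauchy-Schwarz inequality. The sketch below uses that fact to score points, then applies a naive prune-and-recompute loop. It is an illustration only and is not the algorithm analyzed in the paper; the function names `outlier_scores` and `prune_outliers` and the `max_rounds` parameter are hypothetical.

```python
import numpy as np


def outlier_scores(points):
    """For each point x, return the maximum over directions w of
    (w . (x - mean))^2 / average_y (w . (y - mean))^2.

    The maximum equals (x - mean)^T M^+ (x - mean), where M is the
    second-moment matrix about the mean. The pseudo-inverse handles
    directions with zero spread, in which every centered point is zero.
    """
    X = np.asarray(points, dtype=float)
    centered = X - X.mean(axis=0)
    M = centered.T @ centered / len(X)
    M_pinv = np.linalg.pinv(M)
    # Row-wise quadratic form: sum_jk c_ij * P_jk * c_ik
    return np.einsum("ij,jk,ik->i", centered, M_pinv, centered)


def prune_outliers(points, beta, max_rounds=1000):
    """Naive heuristic (an assumption for illustration, not the paper's
    method): drop every point whose score exceeds beta, recompute the
    mean and moments on the survivors, and repeat until none remain.

    Returns the indices of the surviving points.
    """
    X = np.asarray(points, dtype=float)
    keep = np.arange(len(X))
    for _ in range(max_rounds):
        if keep.size == 0:
            break
        bad = outlier_scores(X[keep]) > beta
        if not bad.any():
            break
        keep = keep[~bad]
    return keep
```

A point is a $\beta$-outlier of the set exactly when its score exceeds $\beta$, so `prune_outliers(X, beta)` returns a subset with no $\beta$-outliers relative to its own mean and moments. This loop carries no guarantee on how many points it discards, which is the quantity the paper's theorem controls.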

Bibliographic details

Source
Journal of Computer and System Sciences | 2004, Issue 2 | pp. 335-373 | 39 pages
Authors
Dunagan J.; Vempala S.
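Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
CLC classification: Computing technology, computer technology
Keywords
Convex-bodies; Algorithm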

Similar literature

1. Outlier Detection Using Structural Scores in a High-Dimensional Space [J]. Li Xiaojie, Lv Jiancheng, Yi Zhang. IEEE Transactions on Cybernetics. 2020, Issue 5
2. Random Subspace Learning Approach to High-Dimensional Outliers Detection [J]. Bohan Liu, Ernest Fokoué. Open Journal of Statistics. 2015, Issue 6
3. Finding key knowledge attribute subspace of outliers in high-dimensional dataset [J]. Biao Huang, Peng Yang. Expert Systems with Applications. 2011, Issue 8
4. Optimal outlier removal in high-dimensional [C]. John Dunagan, Santosh Vempala. Annual ACM Symposium on Theory of Computing. 2001
5. High-dimensional data mining: Subspace clustering, outlier detection and applications to classification [D]. Foss, Andrew Philip Ogilvie. 2010
6. Iterative outlier removal: A method for identifying outliers in laboratory recalibration studies [O]. Christina M. Parrinello, Morgan E. Grams, Yingying Sang
7. Optimal outlier removal in high-dimensional spaces [O]. Dunagan John, Vempala Santosh. 2004
