Detecting Outliers in High-Dimensional Datasets with Mixed Attributes

Abstract

Outlier Detection has attracted substantial attention in many applications and research areas. Examples include detection of network intrusions or credit card fraud. Many of the existing approaches are based on pair-wise distances among all points in the dataset. These approaches cannot easily extend to current datasets that usually contain a mix of categorical and continuous attributes, and may be scattered over large geographical areas. In addition, current datasets usually have a large number of dimensions. These datasets tend to be sparse, and traditional concepts such as Euclidean distance or nearest neighbor become unsuitable. We propose ODMAD, a fast outlier detection strategy intended for datasets containing mixed attributes. ODMAD takes into consideration the sparseness of the dataset, and is experimentally shown to be highly scalable with the number of points and number of attributes in the dataset.
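The abstract's remark that Euclidean distance and nearest-neighbor notions become unsuitable in high dimensions can be illustrated with a small experiment. The sketch below is not ODMAD, whose internals the abstract does not describe. It only measures how the gap between the nearest and farthest point from a query shrinks as dimensionality grows. The function name `relative_contrast`, the uniform random data, the sample size, and the fixed seed are all assumptions made for illustration.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """Return (d_max - d_min) / d_min for Euclidean distances
    from one random query point to n_points uniform random points."""
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


# Fixed seed (an arbitrary choice) so repeated runs give the same numbers.
rng = random.Random(0)
contrast_by_dim = {
    d: relative_contrast(500, d, rng) for d in (2, 10, 100, 1000)
}

for d, c in contrast_by_dim.items():
    print(f"dims={d:5d}  relative contrast={c:.3f}")
```

When the relative contrast is small, the nearest and farthest points lie at nearly the same distance from the query, so a distance threshold or nearest-neighbor rank separates outliers from ordinary points poorly. This is one reason distance-based detectors struggle on high-dimensional data.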

Bibliographic details

Source: International Conference on Data Mining, 2008, 7 pages
Authors: A. Koufakou; M. Georgiopoulos; G. C. Anagnostopoulos
Original format: PDF
Chinese Library Classification (CLC): TP247.2-53
Keywords: Outlier Detection; Mixed Attribute Datasets; High Dimensional Data; Large Datasets
Date indexed: 2022-08-21 01:06:45
