KNN Based Outlier Detection Algorithm in Large Dataset

Abstract

An outlier is an object that differs markedly from the rest of the dataset on some measure. Finding such exceptions has received much attention in the data mining field. In this paper, we propose a KNN-based outlier detection algorithm that consists of two phases. First, it partitions the dataset into several clusters; then, within each cluster, it computes the Kth nearest neighborhood of each object to find outliers. In addition, our algorithm uses a pruning scheme that avoids frequent passes over the entire dataset and unnecessary computations. Experimental results on both synthetic and real-life datasets show that our algorithm is efficient for outlier detection in large datasets.
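The two phases described in the abstract can be sketched roughly as follows. The abstract does not name the clustering method, the exact outlier score, or the pruning rule, so everything concrete here is an assumption made for illustration: phase 1 uses plain k-means with caller-supplied initial centers, the score is the distance to the Kth nearest neighbor inside the object's own cluster, the n highest scores are reported, and pruning is a simple early exit once an object can no longer beat the current weakest reported outlier. The names `kmeans`, `kth_nn_distance`, and `top_outliers` are hypothetical and are not taken from the paper.

```python
import heapq
import math


def kmeans(points, init_centers, n_iter=10):
    """Phase 1 (assumed): plain k-means. Returns clusters as lists of point indices."""
    centers = list(init_centers)
    dim = len(points[0])
    clusters = []
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for i, p in enumerate(points):
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(i)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(
                    sum(points[i][d] for i in members) / len(members) for d in range(dim)
                )
    return clusters


def kth_nn_distance(points, idx, members, k, cutoff):
    """Distance from points[idx] to its k-th nearest neighbor among `members`.

    Returns None (pruned) as soon as the running k-th distance, which can only
    shrink, is already <= cutoff. Clusters with fewer than k other members
    score infinity (an assumption of this sketch).
    """
    heap = []  # negated distances: a max-heap holding the k smallest so far
    for j in members:
        if j == idx:
            continue
        d = math.dist(points[idx], points[j])
        if len(heap) < k:
            heapq.heappush(heap, -d)
        elif d < -heap[0]:
            heapq.heapreplace(heap, -d)
        if len(heap) == k and -heap[0] <= cutoff:
            return None
    return -heap[0] if len(heap) == k else math.inf


def top_outliers(points, init_centers, k=2, n_outliers=2):
    """Phase 2 (assumed): rank objects by in-cluster k-th NN distance, keep the top n."""
    top = []  # min-heap of (score, index); top[0] is the weakest reported outlier
    for members in kmeans(points, init_centers):
        for idx in members:
            cutoff = top[0][0] if len(top) == n_outliers else -1.0
            score = kth_nn_distance(points, idx, members, k, cutoff)
            if score is None:
                continue  # pruned: cannot beat the current weakest outlier
            if len(top) < n_outliers:
                heapq.heappush(top, (score, idx))
            else:
                heapq.heapreplace(top, (score, idx))
    return [idx for _, idx in sorted(top, reverse=True)]


# Two tight groups plus one stray object near each group (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (4, 4),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (15, 15)]
print(top_outliers(points, init_centers=[(0, 0), (10, 10)]))  # prints [11, 5]
```

Restricting the neighbor search to one cluster at a time is what keeps the sketch from scanning the whole dataset for every object; the early exit then skips most of the remaining distance computations for objects in dense regions. Whether the paper's pruning scheme works this way is not stated in the abstract.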

Bibliographic information

Source: Education Technology and Training, 2008. and 2008 International Workshop on Geoscience and Remote Sensing. ETT and GRS 2008 | 2009 | pp. 611-613 | 3 pages
Authors: Peng Yang; Biao Huang
Original format: PDF
CLC classification: Geophysics
Keywords: data mining; KNN; outlier detection

Similar literature

1. EBOD: An ensemble-based outlier detection algorithm for noisy datasets [J]. Ouyang Boya, Song Yu, Li Yuhai, Knowledge-Based Systems. 2021, No. Nova14
2. Double-clustering Based Outlier Detection Algorithm for Large Datasets [J]. Qian Wang, Min Zheng, Qingsheng Zhu. Journal of information and computational science. 2011, No. 8
3. KNN-Based Approximate Outlier Detection Algorithm Over IoT Streaming Data [J]. Zhu Rui, Ji Xiaoling, Yu Danyang, Quality Control, Transactions. 2020
4. An Improved KNN Based Outlier Detection Algorithm for Large Datasets [C]. Qian Wang, Min Zheng. International conference on advanced data mining and applications; ADMA 2010. 2010
5. DNIDS: A Dependable Network Intrusion Detection System using the CSI-KNN algorithm. [D]. Kuang, Liwei (Vivian). 2007
6. Correction: GTI: A Novel Algorithm for Identifying Outlier Gene Expression Profiles from Integrated Microarray Datasets [O]. John Patrick Mpindi, Henri Sara, Saija Haapa-Paananen, 2011
7. KNN-Based Approximate Outlier Detection Algorithm Over IoT Streaming Data [O]. Rui Zhu, Xiaoling Ji, Danyang Yu, 2020
