A Novel Technique to Find Outliers in Mixed Attribute Datasets

Abstract

An outlier is a data point that differs significantly from the remaining data points. Outliers are also referred to as discordants, deviants, and abnormalities. Outliers can be of particular interest, as in credit card fraud detection, where they indicate fraudulent activity. Outlier detection, also referred to as outlier analysis, is therefore an interesting data mining task. Detecting outliers efficiently in a dataset is important in many fields, such as credit card fraud detection, medicine, law enforcement, and earth sciences. Many methods are available to identify outliers in numerical datasets, but only a limited number are available for categorical and mixed attribute datasets. This work proposes a novel outlier detection method that finds anomalies based on each record's "multi attribute outlier factor through correlation" score, which has great intuitive appeal. The algorithm uses the frequency of each value in the categorical part of the dataset and the correlation factor of each record with the mean record of the entire dataset. The Attribute Value Frequency score (AVF score) concept is used for the categorical part. Results of the proposed method are compared with existing methods. The experiments in this paper use the Bank data (mixed), taken from the UCI machine learning repository.

Keywords: Outlier, Mixed Attribute Datasets, Attribute Value Frequency Score
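The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formula that combines the two parts into the "multi attribute outlier factor through correlation" score, so the sketch computes them separately and does not combine them. It assumes the commonly used AVF definition (the average frequency of a record's categorical values) and Pearson correlation for the numeric part. The function names and the toy records are hypothetical and are not taken from the Bank data.

```python
from collections import Counter
from math import sqrt


def avf_scores(cat_rows):
    """AVF score per record: average frequency of its categorical values.

    Assumes the common AVF formulation; a low score means rare values.
    """
    n_attrs = len(cat_rows[0])
    # One frequency table per categorical attribute (column).
    counts = [Counter(row[j] for row in cat_rows) for j in range(n_attrs)]
    return [
        sum(counts[j][row[j]] for j in range(n_attrs)) / n_attrs
        for row in cat_rows
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # Return 0.0 for a constant vector, where correlation is undefined.
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_with_mean(num_rows):
    """Correlation of each record's numeric part with the mean record."""
    n = len(num_rows)
    mean_record = [sum(col) / n for col in zip(*num_rows)]
    return [pearson(row, mean_record) for row in num_rows]


# Hypothetical toy records: categorical part and numeric part of 4 rows.
cat_rows = [
    ("admin", "married"),
    ("admin", "married"),
    ("admin", "single"),
    ("student", "divorced"),
]
num_rows = [
    (30, 1000, 2),
    (35, 1200, 1),
    (40, 900, 3),
    (25, 5, 900),
]

avf = avf_scores(cat_rows)
corr = correlation_with_mean(num_rows)
for i, (a, c) in enumerate(zip(avf, corr)):
    print(f"record {i}: AVF={a:.2f}  corr_with_mean={c:.3f}")
```

In this toy data the last record has both the lowest AVF score and the lowest correlation with the mean record, which is the kind of record such a method would be expected to flag. In practice the numeric attributes would probably need scaling first, since raw correlation is dominated by the attribute with the largest range.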
