Comparison between Classifier's Accuracies Based on Different Outlier Methods Generated by Frequent and Infrequent Categorical Data

Abstract

Outlier analysis is an essential task in data science: it exposes inconsistencies in data and supports both classifier construction and decision making. Finding outliers in categorical data is difficult. This work compares the accuracies of classifiers built after applying different outlier analysis methods, which are generated from frequent and infrequent itemsets in categorical data. When modeling a classifier for categorical data, highly frequent records are the most useful, while infrequent records are obstacles. Experiments on the Bank and Nursery datasets, taken from the UCI ML Repository, compare the available methods with the proposed method. For normally distributed OFI, the number of outliers to eliminate need not be supplied as input, because the method determines that number automatically. However, a threshold value must be supplied to generate the infrequent itemsets for NOFI.
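
The abstract lists AVF Score among its keywords but does not define any of the scores. As a rough illustration of a frequency-based outlier score for categorical records, the sketch below follows the commonly cited Attribute Value Frequency definition: a record's score is the mean, over its attributes, of how often each of its values occurs in that attribute's column, and low scores suggest outliers. The automatic cutoff (mean minus k standard deviations) is an assumption chosen to show how a normal-distribution rule can avoid asking for the number of outliers up front; it is not taken from the paper. The function names and the toy records are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def avf_scores(records):
    """Return one AVF score per record: the mean frequency of its attribute values."""
    n_attrs = len(records[0])
    # One frequency table per attribute (column).
    counts = [Counter(row[j] for row in records) for j in range(n_attrs)]
    return [sum(counts[j][row[j]] for j in range(n_attrs)) / n_attrs for row in records]


def flag_outliers(scores, k=1.0):
    """Assumed rule (not from the paper): flag scores below mean - k * std."""
    cutoff = mean(scores) - k * pstdev(scores)
    return [i for i, s in enumerate(scores) if s < cutoff]


# Hypothetical toy data: three identical records and one rare record.
records = [
    ("married", "yes"),
    ("married", "yes"),
    ("married", "yes"),
    ("single", "no"),
]
scores = avf_scores(records)
outliers = flag_outliers(scores)
```

With this rule the caller tunes only `k`, not the number of records to drop; the flagged indices could then be removed before training a classifier and the accuracies compared with and without removal.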

Bibliographic record

Source
IEEE International Conference on Advanced Information Networking and Applications | 2016 | 6 pages
Authors
B. Raveendra Babu; Lakshmi Sreenivasa Reddy D.
Original format: PDF
Chinese Library Classification: Computer networks
Keywords
AVF Score; BAD Score; FAVF Score; FBAD Score; NAVF Score; NBAD Score; NOFI Score; OFI Score; Outlier analysis

Similar literature

1. UWFP-Outlier: an efficient frequent-pattern-based outlier detection method for uncertain weighted data streams [J] . Cai Saihua, Li Li, Li Qian, Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies . 2020, No. 10

2. Generalised linear model-based algorithm for detection of outliers in environmental data and comparison with semi-parametric outlier detection methods [J] . Martina Čampulová, Jaroslav Michálek, Jiří Moučka, Atmospheric Pollution Research . 2019, No. 4

3. Using Ensemble StackingC Method and Base Classifiers to Ameliorate Prediction Accuracy of Pedagogical Data [J] . Mudasir Ashraf, Majid Zaman, Muheet Ahmed, Procedia Computer Science . 2018, No. 1

4. Comparison between Classifier's Accuracies Based on Different Outlier Methods Generated by Frequent and Infrequent Categorical Data [C] . B. Raveendra Babu, Lakshmi Sreenivasa Reddy D. IEEE International Conference on Advanced Information Networking and Applications . 2016

5. Comparison of the accuracy of fit of CAD/CAM crowns using three different data acquisition methods. [D] . Jokhadar, Hossam Faisal. 2013

6. A Comparison of Methods for Classifying Clinical Samples Based on Proteomics Data: A Case Study for Statistical and Machine Learning Approaches [O] . Dayle L. Sampson, Tony J. Parker, Zee Upton, 2011

7. A Comparison of Methods for Classifying Clinical Samples Based on Proteomics Data: A Case Study for Statistical and Machine Learning Approaches [O] . Sampson, Dayle L., Parker, Tony J., Upton, Zee, 2011

8. An Extended Kalman Filter for frequent local and infrequent global sensor data fusion [R] . Stergios I. Roumeliotis, George A. Bekey 1997

