Conference: Computer Applications in Industry and Engineering

Fuzzy Analysis and Classification of Mislabeled and Noisy Data

Abstract

A classifier learns from a "training" data set so that it can later correctly classify a new pattern drawn from the same population as the training set. However, when the examples given to a learning algorithm consist of real-world data, they are usually tainted with noise, ambiguity, uncertainty, imprecision, vagueness, or incompleteness. Noise may be introduced by outliers, which result from bad measurements or pattern mislabeling. Classification of such noisy data must nevertheless be efficient and accurate. In this paper, we address this problem by introducing an efficient tool for feature selection in which "bad" (non-discriminating) features are dropped and "good" features are weighted according to how well they separate the classes in a data set. Good features are responsible for "electing" the class to which the feature vector under test should naturally belong. We therefore call our new method "EFCLASS", denoting Election Fuzzy Classification. The proposed method is simple, fast, and accurate. Various data sets known to be good examples for classification algorithms are used to test the performance of the proposed fuzzy classifier.
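The abstract does not give the EFCLASS formulas, so the following is only a minimal sketch of the general idea it describes: drop non-discriminating features, weight the rest by class separation, and let the weighted features "elect" a class. The Fisher-style separation score, the Gaussian membership function, and the names `fit`, `predict`, and `min_score` are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


def fit(X, y, min_score=0.5):
    """Keep features that separate the classes and weight them by a separation score.

    Assumption: the score is between-class variance of the class means divided by
    the average within-class variance. The paper's actual measure may differ.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    kept = []  # entries: (feature index, raw score, {class: (mean, std)})
    for j in range(len(X[0])):
        stats = {}
        for label, rows in by_class.items():
            vals = [r[j] for r in rows]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            stats[label] = (mean, math.sqrt(var) + 1e-9)  # epsilon avoids division by zero
        means = [m for m, _ in stats.values()]
        grand = sum(means) / len(means)
        between = sum((m - grand) ** 2 for m in means) / len(means)
        within = sum(s ** 2 for _, s in stats.values()) / len(stats)
        score = between / within
        if score >= min_score:  # "bad" features fall below the threshold and are dropped
            kept.append((j, score, stats))

    if not kept:
        raise ValueError("no feature passed the separation threshold")
    total = sum(score for _, score, _ in kept)
    return [(j, score / total, stats) for j, score, stats in kept]


def predict(model, x):
    """Each kept feature casts a weighted fuzzy vote for every class; the top class wins."""
    votes = defaultdict(float)
    for j, weight, stats in model:
        for label, (mean, std) in stats.items():
            # Assumed Gaussian membership of x[j] in this class for feature j.
            membership = math.exp(-0.5 * ((x[j] - mean) / std) ** 2)
            votes[label] += weight * membership
    return max(votes, key=votes.get)


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit(X, y)
label_low = predict(model, [1.1, 9.0])
label_high = predict(model, [2.9, 0.0])
```

In this toy run only feature 0 survives selection, so the extreme values placed in feature 1 of the test vectors have no influence on the elected class.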
