
An Instance Selection Algorithm Based on ReliefF



Abstract

As data volumes grow, many methods have been proposed to extract useful data and remove noisy data. Instance selection is one such method: it keeps a subset of the instances in a data set and discards the rest. This paper proposes a new instance selection algorithm based on ReliefF, a feature selection algorithm. For each instance, the proposed algorithm uses the Jaccard index to find the nearest instances in each class. It then computes a weight for each instance from its set of nearest neighbors. Finally, only the instances with the highest weights are selected. The algorithm can reduce the data at a specified rate and can process instances in parallel. It works on data sets with nominal and numeric attributes and missing values, and it is also suitable for imbalanced data sets. The proposed algorithm was tested on three data sets. The results show that it reduces the volume of data without a significant change in classification accuracy on these data sets.
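The abstract gives the steps but not the formulas, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. Several things here are assumptions: the function names, the weight formula (mean Jaccard similarity to the nearest same-class instances minus the mean similarity to the nearest instances of the other classes, in the spirit of ReliefF hits and misses), the treatment of missing values as `None` that are simply skipped, and the expectation that numeric attributes have been discretized beforehand so that values can be compared for equality.

```python
from collections import defaultdict


def jaccard_similarity(a, b):
    """Jaccard index over sets of (attribute index, value) pairs.

    Assumption: missing values are encoded as None and ignored.
    """
    sa = {(i, v) for i, v in enumerate(a) if v is not None}
    sb = {(i, v) for i, v in enumerate(b) if v is not None}
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def _mean_top_k(values, k):
    """Mean of the k largest values, or 0.0 if there are none."""
    top = sorted(values, reverse=True)[:k]
    return sum(top) / len(top) if top else 0.0


def instance_weights(X, y, k=1):
    """Hypothetical ReliefF-style weight for every instance.

    weight = mean similarity to the k nearest same-class instances
             - average, over the other classes, of the mean similarity
               to the k nearest instances of that class.
    """
    weights = []
    for i, xi in enumerate(X):
        sims_by_class = defaultdict(list)
        for j, xj in enumerate(X):
            if j != i:
                sims_by_class[y[j]].append(jaccard_similarity(xi, xj))
        hit = _mean_top_k(sims_by_class.get(y[i], []), k)
        misses = [
            _mean_top_k(sims, k)
            for label, sims in sims_by_class.items()
            if label != y[i]
        ]
        miss = sum(misses) / len(misses) if misses else 0.0
        weights.append(hit - miss)
    return weights


def select_instances(X, y, reduction_rate=0.5, k=1):
    """Return the sorted indices of the highest-weight instances.

    reduction_rate is the fraction of instances to drop.
    """
    weights = instance_weights(X, y, k)
    n_keep = max(1, len(X) - int(len(X) * reduction_rate))
    ranked = sorted(range(len(X)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy nominal data: the last class-0 instance resembles class 1.
X = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z"), ("b", "z"), ("a", "z")]
y = [0, 0, 0, 1, 1, 0]
selected = select_instances(X, y, reduction_rate=0.5)
```

Under this assumed weighting, the borderline instance at index 5 receives the lowest weight and is dropped first. Each instance's weight depends only on the fixed data set, so the loop in `instance_weights` could be distributed across workers, which is consistent with the abstract's remark about running in parallel over the instances.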
