首页> 中文期刊> 《计算机工程》 >基于冗余实例对消除算法的实例选择

基于冗余实例对消除算法的实例选择

             

摘要

Instance selection is a kind of effective method to remove the noise and redundant data. According to the unbalance between the generalization ability and reduction in existing instance selection methods, this paper proposes a new instance selection method:Redundant Instance Pair Elimination(RIPE) algorithm. It gives the concept of nearest similar pair, calculates the nearest similar pair of datasets, and removes the eligible instances. The simulation experimental results in 11 different datasets show that the classification accuracy and storage compression ratio of processed dataset are obviously improved compared with original datasets. Contrasted with Edited Nearest Neighbor rule(ENN) algorithm, this algorithm can keep the classification accuracy, improve more than 35% in average storage compression ratio, keep intact the data distribution of original datasets, and make better compromise in the classification accuracy and the storage compression ratio.%实例选择能有效移除数据中的噪声和冗余数据,但现有方法难以在提高泛化能力的同时实现约简。针对该问题,提出一种冗余实例对消除算法用于实例选择。给出最近同类实例对的概念,计算数据集中存在的最近同类实例对,并移除满足条件的实例,在11个不同数据集上进行的仿真实验结果表明,经过该算法处理后的数据集在分类准确率和存储压缩率上较原始样本集有明显提升。对比剪辑最近邻规则算法,该算法能够在保持分类准确率的同时提高平均存储压缩率35%以上,并完整保留原始样本集的数据分布特征,在分类准确率和存储压缩率上取得折中。

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号