The Generalized Condensed Nearest Neighbor Rule as A Data Reduction Method

Abstract

In this paper, we propose a new data reduction algorithm that iteratively selects some samples and ignores others that can be absorbed, or represented, by those selected. The algorithm differs from the condensed nearest neighbor (CNN) rule in that it employs a strong absorption criterion, whereas CNN employs a weak one; hence, it is called the generalized CNN (GCNN) algorithm. The new criterion makes CNN a special case of GCNN and allows GCNN to achieve consistency, or asymptotic Bayes-risk efficiency, under certain conditions. Moreover, GCNN can yield significantly better accuracy than other instance-based data reduction methods. We demonstrate this last claim through experiments on five datasets, some of which contain a very large number of samples.
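
The abstract does not state the absorption criterion itself, so the following is only a minimal sketch of the select-and-absorb loop under an assumed margin-based form: a sample is treated as absorbed when its nearest other-class prototype is farther away than its nearest same-class prototype by more than `rho * delta`, where `delta` is the smallest distance between samples of different classes. With `rho = 0` this reduces to the usual CNN test (the nearest prototype has the same label). The names `gcnn_select`, `is_absorbed`, `rho`, and `delta`, as well as the choice to seed each class with its first sample, are assumptions made for illustration and are not taken from the paper.

```python
import math
from itertools import combinations


def min_interclass_distance(X, y):
    """Smallest distance between two samples with different labels.

    Assumes the data contain at least two classes.
    """
    return min(
        math.dist(X[i], X[j])
        for i, j in combinations(range(len(X)), 2)
        if y[i] != y[j]
    )


def is_absorbed(i, protos, X, y, threshold):
    """Assumed strong criterion: d(other-class) - d(same-class) > threshold."""
    same = min(math.dist(X[i], X[p]) for p in protos if y[p] == y[i])
    other = min(math.dist(X[i], X[p]) for p in protos if y[p] != y[i])
    return other - same > threshold


def gcnn_select(X, y, rho=0.0):
    """Return indices of the selected prototypes, in order of selection.

    rho = 0 gives the CNN-style weak test; larger rho (below 1) is stricter
    and can only make absorption harder.
    """
    threshold = rho * min_interclass_distance(X, y)

    # Simplified seeding (an assumption): first sample seen of each class.
    protos, seen = [], set()
    for i, label in enumerate(y):
        if label not in seen:
            seen.add(label)
            protos.append(i)

    selected = set(protos)
    changed = True
    while changed:  # repeat passes until every sample is absorbed
        changed = False
        for i in range(len(X)):
            if i in selected:
                continue
            if not is_absorbed(i, protos, X, y, threshold):
                protos.append(i)
                selected.add(i)
                changed = True
    return protos
```

Calling `gcnn_select(X, y, rho=0.0)` on a list of feature tuples `X` and labels `y` returns the retained sample indices; raising `rho` toward 1 tightens the assumed criterion, so more samples near the class boundary tend to be kept.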