Pattern Recognition Letters

Reverse-nearest neighborhood based oversampling for imbalanced, multi-label datasets



Abstract

In this article, we present a novel reverse-nearest-neighborhood-based oversampling scheme for the imbalanced labels of a multi-label dataset. The reverse nearest neighborhood of a query point consists of all points that have the query point among their nearest neighbors. This lets us identify an adaptive number of neighbors (according to the density and distribution of points) instead of a fixed number. For each label, we add label-specific synthetic minority instances in the reverse nearest neighborhood of that label's minority points. The reverse nearest neighbor configuration also detects singular minority points, which we avoid using as seed points in the oversampling phase. On the oversampled data of each label, we train and invoke a Linear Support Vector Machine to complete the learning and testing. Results against competing methods on class-imbalance-focused metrics indicate that the proposed method handles multi-label datasets with differing degrees of imbalance competently. (C) 2019 Elsevier B.V. All rights reserved.
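The idea in the abstract can be illustrated with a minimal sketch for a single label. It is not the authors' implementation: the abstract does not say how neighbors are counted or how synthetic points are generated, so the sketch assumes that the k-nearest-neighbor search runs over the minority points only, that a "singular" point is one with an empty reverse neighborhood, and that new points are created by SMOTE-style linear interpolation between a seed and one of its reverse neighbors. The function names and the parameters `k`, `n_new`, and `seed` are hypothetical.

```python
import numpy as np

def reverse_nearest_neighbors(X, k):
    """For each point i, list the points j that count i among their k nearest neighbors."""
    n = len(X)
    k = min(k, n - 1)  # a point cannot be its own neighbor
    # Pairwise squared Euclidean distances, with self-distances masked out.
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)
    knn = np.argsort(dist, axis=1)[:, :k]  # knn[j] = k nearest neighbors of j
    rnn = [[] for _ in range(n)]
    for j in range(n):
        for i in knn[j]:
            rnn[int(i)].append(j)  # j points at i, so j is a reverse neighbor of i
    return rnn

def oversample_label(X, y, k=3, n_new=10, seed=0):
    """Add n_new synthetic minority points (y == 1) for one label.

    Assumptions (not from the source): neighbors are searched among
    minority points only, and synthesis is linear interpolation.
    """
    rng = np.random.default_rng(seed)
    X_min = X[np.flatnonzero(y == 1)]
    rnn = reverse_nearest_neighbors(X_min, k)
    # Singular points (empty reverse neighborhood) are skipped as seeds.
    seeds = [i for i in range(len(X_min)) if rnn[i]]
    if not seeds or n_new <= 0:
        return X.copy(), y.copy()
    synth = np.empty((n_new, X.shape[1]))
    for s in range(n_new):
        i = seeds[rng.integers(len(seeds))]
        j = rnn[i][rng.integers(len(rnn[i]))]
        t = rng.random()  # interpolation weight in [0, 1)
        synth[s] = X_min[i] + t * (X_min[j] - X_min[i])
    X_aug = np.vstack([X, synth])
    y_aug = np.concatenate([y, np.ones(n_new, dtype=y.dtype)])
    return X_aug, y_aug
```

Because the size of each reverse neighborhood depends on how many other points select the seed, dense regions naturally yield more candidate partners than sparse ones, which is the adaptive behavior the abstract describes. In a multi-label setting, this routine would be applied once per label, and each label's augmented data could then be passed to a linear SVM such as scikit-learn's `LinearSVC`.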
