Source: BioData Mining (U.S. National Institutes of Health literature)

LVQ-SMOTE – Learning Vector Quantization based Synthetic Minority Over–sampling Technique for biomedical data

Abstract

Background: Over-sampling methods based on the Synthetic Minority Over-sampling Technique (SMOTE) have been proposed for classification problems on imbalanced biomedical data. However, the existing over-sampling methods achieve only slightly better, and sometimes worse, results than the simplest SMOTE. To improve the effectiveness of SMOTE, this paper presents a novel over-sampling method that uses codebooks obtained by learning vector quantization (LVQ). In general, even when an existing SMOTE variant is applied to a biomedical dataset, the unoccupied region of the feature space remains so large that most classification algorithms do not perform well at estimating the boundaries between classes. To tackle this problem, our over-sampling method generates synthetic samples that occupy more of the feature space than those produced by other SMOTE algorithms. In brief, our over-sampling method can generate useful synthetic samples by referring to actual samples taken from real-world datasets.
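For orientation, the baseline that the abstract calls "the simplest SMOTE" can be sketched as below. This is a minimal illustration of plain SMOTE interpolation only, not the LVQ-based method of the paper, whose details are not given in the abstract. The function name `smote`, its parameters, and the toy data are assumptions made for this example.

```python
import numpy as np

def smote(minority, n_synthetic, k=5, seed=0):
    """Plain SMOTE sketch: interpolate between minority samples and
    randomly chosen members of their k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise squared Euclidean distances within the minority class.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest neighbours of every minority sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for each synthetic point.
    base = rng.integers(0, n, size=n_synthetic)
    pick = rng.integers(0, k, size=n_synthetic)
    partner = neighbours[base, pick]

    # Place the synthetic point at a random position on the segment
    # joining the base sample and the chosen neighbour.
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[partner] - minority[base])

# Toy usage: 10 minority samples in 2 dimensions, 20 synthetic samples.
toy_minority = np.random.default_rng(1).normal(size=(10, 2))
synthetic = smote(toy_minority, n_synthetic=20)
print(synthetic.shape)  # (20, 2)
```

Because every synthetic point lies on a line segment between two existing minority samples, plain SMOTE cannot place samples outside the region already spanned by the minority class, which is consistent with the abstract's remark that much of the feature space stays empty.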
