Published in: International Conference on Fuzzy Systems and Knowledge Discovery

A New Feature Weighting Method Based on Probability Distribution in Imbalanced Text Classification

Abstract

Many real-world text classification tasks involve imbalanced training examples. Categories with fewer examples are under-represented, and their classifiers often perform far below a satisfactory level. We propose a new approach that uses a probability distribution to assign feature weights and apply it to a Naive Bayes classifier. The method is evaluated experimentally on the FuDan Chinese Corpus. The experimental results show significant improvement on imbalanced datasets, while performance on balanced datasets is not jeopardized. Our approach offers a simple and effective way to boost text classification performance on skewed datasets.
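
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of feature-weighted Naive Bayes, not the paper's method. It assumes a multinomial model with Laplace smoothing and uses a hypothetical weight, one plus one minus the normalized entropy of a term's class-conditional probabilities across classes, so that terms concentrated in a single class count for more. The function names, the weight formula, and the toy corpus are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing."""
    vocab = sorted({t for d in docs for t in d})
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        term_counts[y].update(tokens)
    log_prior = {c: math.log(k / len(docs)) for c, k in class_docs.items()}
    cond = {}
    for c in class_docs:
        total = sum(term_counts[c].values()) + alpha * len(vocab)
        cond[c] = {t: (term_counts[c][t] + alpha) / total for t in vocab}
    return vocab, log_prior, cond


def feature_weights(vocab, cond):
    """Hypothetical weight in [1, 2]; NOT the formula from the paper.

    The class-conditional probabilities P(t | c) are normalized over
    classes, so the weight does not depend directly on class size.
    """
    classes = list(cond)
    if len(classes) < 2:
        return {t: 1.0 for t in vocab}
    max_entropy = math.log(len(classes))
    weights = {}
    for t in vocab:
        z = sum(cond[c][t] for c in classes)
        probs = [cond[c][t] / z for c in classes]
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        weights[t] = 1.0 + (1.0 - entropy / max_entropy)
    return weights


def predict(tokens, log_prior, cond, weights):
    """Score each class with weighted log-likelihoods; skip unseen terms."""
    scores = {}
    for c in log_prior:
        s = log_prior[c]
        for t in tokens:
            if t in weights:
                s += weights[t] * math.log(cond[c][t])
        scores[c] = s
    return max(scores, key=scores.get)


# Toy imbalanced corpus (illustrative only): four "sports" documents, one "finance".
docs = [
    ["match", "goal", "team"],
    ["team", "win", "goal"],
    ["match", "team", "score"],
    ["goal", "score", "win"],
    ["stock", "market", "team"],
]
labels = ["sports"] * 4 + ["finance"]

vocab, log_prior, cond = train(docs, labels)
weights = feature_weights(vocab, cond)
minority_guess = predict(["stock", "market"], log_prior, cond, weights)
majority_guess = predict(["goal", "team"], log_prior, cond, weights)
```

In this sketch a term such as "team", which is about equally likely in both classes, keeps a weight close to 1, while "stock", which is concentrated in the minority class, receives a larger weight and therefore more influence on the decision.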