Conference: Online World Conference on Soft Computing in Industrial Applications
Performance Analysis of Naive Bayes Classification, Support Vector Machines and Neural Networks for Spam Categorization

Abstract

Spam mail recognition is a growing field that brings together natural language processing and machine learning, since it is in essence a two-class classification of natural-language texts. An important feature of spam recognition is that it is cost-sensitive: misclassifying a non-spam mail as spam is generally a more severe error than misclassifying a spam mail as non-spam. To be comparable, the methods applied in this field should all be evaluated on the same corpus and within the same cost-sensitive framework. In this paper, the performance of Support Vector Machines (SVM), Neural Networks (NN) and Naive Bayes (NB) is compared on a publicly available corpus (LINGSPAM) under different cost scenarios. The training time complexities of the methods are also evaluated. The results show that NN performs significantly better than the other two methods while keeping training times acceptable. NB gives better results than SVM when the cost is extremely high, while in all other cases SVM outperforms NB.
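
The abstract does not say which cost measure the paper uses, so the following is only a minimal sketch of one common way to make spam evaluation cost-sensitive. It assumes a hypothetical parameter `cost` that states how many times worse it is to block a legitimate mail than to let a spam mail through. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
def weighted_accuracy(y_true, y_pred, cost=9.0):
    """Accuracy in which every legitimate mail counts `cost` times.

    Labels: 1 = spam, 0 = legitimate (non-spam).
    `cost` is a hypothetical weight, not a value from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    legit_ok = sum(1 for t, p in pairs if t == 0 and p == 0)
    spam_ok = sum(1 for t, p in pairs if t == 1 and p == 1)
    n_legit = sum(1 for t in y_true if t == 0)
    n_spam = len(y_true) - n_legit
    return (cost * legit_ok + spam_ok) / (cost * n_legit + n_spam)


def spam_threshold(cost):
    """Minimum P(spam) at which flagging a mail minimizes expected cost.

    Flag as spam only if P(spam) > cost * P(legitimate),
    i.e. P(spam) > cost / (1 + cost).
    """
    return cost / (1.0 + cost)


# Toy labels (made up for illustration): three legitimate mails, two spam.
y_true = [0, 0, 0, 1, 1]
blocks_one_legit = [1, 0, 0, 1, 1]  # one legitimate mail flagged as spam
misses_one_spam = [0, 0, 0, 0, 1]   # one spam mail let through

if __name__ == "__main__":
    for cost in (1.0, 9.0):
        print(cost,
              weighted_accuracy(y_true, blocks_one_legit, cost),
              weighted_accuracy(y_true, misses_one_spam, cost),
              spam_threshold(cost))
```

With `cost=1.0` both toy classifiers score the same, because each makes exactly one error. Raising `cost` penalizes the classifier that blocks a legitimate mail more heavily, which is the asymmetry the abstract describes.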
