Conference: International Conference of the Italian Association for Artificial Intelligence

Spam Filtering Using Regularized Neural Networks with Rectified Linear Units

Abstract

The rapid growth of unsolicited and unwanted messages has inspired the development of many anti-spam methods. Machine-learning methods such as Naive Bayes (NB), support vector machines (SVMs), and neural networks (NNs) have been particularly effective in categorizing spam and non-spam messages. They automatically construct word lists and the associated weights, usually in a bag-of-words fashion. However, traditional multilayer perceptron (MLP) NNs usually suffer from slow convergence to a poor local minimum and from overfitting. To overcome these problems, we use a regularized NN with rectified linear units (RANN-ReL) for spam filtering. We compare its performance on three benchmark spam datasets (Enron, SpamAssassin, and SMS Spam Collection) with four machine-learning algorithms commonly used in text classification, namely NB, SVM, MLP, and k-NN. We show that the RANN-ReL outperforms the other methods in terms of classification accuracy, false negative rate, and false positive rate. Notably, it classifies both the major (legitimate) and minor (spam) classes well.
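The ingredients named in the abstract (bag-of-words features, a network with rectified linear units, a regularizer, and accuracy with false positive and false negative rates) can be illustrated with a minimal sketch. This is not the authors' RANN-ReL: the abstract does not say which regularizer is used, so L2 weight decay is assumed here, and the class name `TinyReluNet`, the helper functions, the hyperparameters, and the four-message toy corpus are all hypothetical.

```python
import math
import random

def build_vocab(texts):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for text in texts:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def bag_of_words(text, vocab):
    """Count-based bag-of-words vector; unknown tokens are ignored."""
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def relu(z):
    """Rectified linear unit: max(0, z)."""
    return z if z > 0.0 else 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyReluNet:
    """One hidden ReLU layer, sigmoid output, L2 weight penalty (an assumption)."""

    def __init__(self, n_in, n_hidden, l2=1e-3, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0
        self.l2 = l2

    def forward(self, x):
        """Return hidden activations and the predicted spam probability."""
        h = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.W1, self.b1)]
        p = sigmoid(sum(w * hj for w, hj in zip(self.W2, h)) + self.b2)
        return h, p

    def sgd_step(self, x, y, lr=0.1):
        """One stochastic gradient step on cross-entropy plus (l2/2)*||W||^2."""
        h, p = self.forward(x)
        d_out = p - y  # gradient w.r.t. the output pre-activation
        for j, hj in enumerate(h):
            # ReLU passes the gradient only where the unit was active.
            d_h = d_out * self.W2[j] if hj > 0.0 else 0.0
            self.W2[j] -= lr * (d_out * hj + self.l2 * self.W2[j])
            for i, xi in enumerate(x):
                self.W1[j][i] -= lr * (d_h * xi + self.l2 * self.W1[j][i])
            self.b1[j] -= lr * d_h
        self.b2 -= lr * d_out

def error_rates(y_true, y_pred):
    """Return (accuracy, false positive rate, false negative rate); spam = 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = (tp + tn) / len(pairs)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    return acc, fpr, fnr

# Hypothetical toy corpus, for illustration only (1 = spam, 0 = legitimate).
texts = ["win free prize now", "free cash offer",
         "meeting at noon", "lunch tomorrow at noon"]
labels = [1, 1, 0, 0]

vocab = build_vocab(texts)
X = [bag_of_words(t, vocab) for t in texts]
net = TinyReluNet(len(vocab), n_hidden=4)
for _ in range(200):
    for x, y in zip(X, labels):
        net.sgd_step(x, y)

preds = [1 if net.forward(x)[1] >= 0.5 else 0 for x in X]
print(error_rates(labels, preds))
```

Here a false positive is a legitimate message flagged as spam and a false negative is a spam message that gets through, so the two rates are computed separately for the legitimate and spam classes. The sketch evaluates on its own training messages only to stay short; a real comparison would need held-out data such as the benchmark datasets named above.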
