International Conference on Advanced Computer Science and Information Systems

Multiclass SMS message categorization: Beyond spam binary classification


Abstract

SMS spam has grown as mobile phone usage has increased. Past research on SMS spam detection classified messages into only two categories: spam and not spam. This binary classification prevents users from seeing spam messages that they do not actually dislike, e.g. an advertisement for a favorite product. In this paper, we propose multi-class classification of SMS into four categories: regular, info, ads, and fraud. We use content-based features (top-N unigrams) as well as non-content-based features. The results show that the best accuracy, 97.5%, is achieved by logistic regression configured with normalization preprocessing and 4096 top-N unigram features.
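
The content-based features mentioned in the abstract (normalization followed by top-N unigram counts) can be illustrated with a minimal sketch. The abstract does not specify the normalization rules or the tokenizer, so the `normalize`, `tokenize`, `top_n_vocabulary`, and `vectorize` helpers below are hypothetical, and the sample messages are invented for illustration. The paper reports N = 4096; the sketch uses N = 4 to keep the output readable.

```python
import re
from collections import Counter


def normalize(text):
    """Assumed normalization: lowercase, then map URLs and digit runs to placeholders."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " <url> ", text)
    text = re.sub(r"\d+", " <num> ", text)
    return text


def tokenize(text):
    """Split a normalized message into unigrams (placeholders or alphabetic words)."""
    return re.findall(r"<url>|<num>|[a-z']+", normalize(text))


def top_n_vocabulary(messages, n):
    """Map the n most frequent unigrams in the corpus to feature indices."""
    counts = Counter(tok for msg in messages for tok in tokenize(msg))
    return {tok: i for i, (tok, _) in enumerate(counts.most_common(n))}


def vectorize(message, vocab):
    """Count how often each vocabulary unigram occurs in one message."""
    vec = [0] * len(vocab)
    for tok in tokenize(message):
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] += 1
    return vec


# Invented sample messages, not taken from the paper's dataset.
messages = [
    "Your package arrives at 10 today",
    "Win a FREE prize now, visit http://example.com",
    "Free data bonus 5 GB, reply YES now",
]
vocab = top_n_vocabulary(messages, n=4)
features = [vectorize(m, vocab) for m in messages]
```

Count vectors of this kind, possibly concatenated with non-content features such as message length, could then be passed to a multi-class logistic regression classifier over the four labels (regular, info, ads, fraud).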
