Conference: Fourth International Conference on Genetic and Evolutionary Computing

A Keyword Based Strategy for Spam Topic Discovery from the Internet


Abstract

The increasing volume of spam has become a serious threat not only to the Internet but also to society. However, discovering spam on the Internet effectively and efficiently remains a great challenge. Content-based filtering is one of the mainstream methods for addressing this problem. This paper proposes a content-based spam topic detection strategy built on keyword extraction. In particular, spam topics are detected using a multi-feature topic model with clue keywords, which integrates the corresponding features of News, BBS, and Blog sources. The method achieves a minimum cost of 0.282 on the TDT4 evaluation corpus and a satisfaction rate of 93.3% on the Golaxy public opinion monitoring system of ICT, which is more effective than the traditional method. The experiments show that the algorithm is effective for spam topic detection.
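
The abstract does not define its "min cost" figure. In TDT-style evaluations this usually means the minimum normalized detection cost over a system's operating points, and the sketch below assumes that reading. The parameter values (C_miss = 1.0, C_fa = 0.1, P_target = 0.02) are the commonly used TDT settings, not values stated in the abstract, and the function names are illustrative only.

```python
# Assumed TDT-style cost parameters; the abstract does not state them.
C_MISS = 1.0      # cost of missing an on-topic story
C_FA = 0.1        # cost of a false alarm
P_TARGET = 0.02   # assumed prior probability that a story is on-topic


def normalized_detection_cost(p_miss, p_fa,
                              c_miss=C_MISS, c_fa=C_FA, p_target=P_TARGET):
    """Return the normalized detection cost for one operating point.

    p_miss: fraction of on-topic stories the system missed.
    p_fa:   fraction of off-topic stories the system wrongly accepted.
    """
    raw = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Normalize by the cost of the better trivial system
    # (reject everything vs. accept everything).
    norm = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return raw / norm


def min_cost(operating_points):
    """Return the lowest normalized cost over (p_miss, p_fa) pairs."""
    return min(normalized_detection_cost(pm, pf) for pm, pf in operating_points)


if __name__ == "__main__":
    # Hypothetical operating points, not taken from the paper.
    points = [(1.0, 0.0), (0.0, 1.0), (0.2, 0.01)]
    print(round(min_cost(points), 3))
```

Under this definition a system that rejects every story scores 1.0, so a value such as 0.282 would indicate a substantial improvement over that trivial baseline. Lower is better.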
