Journal: Pattern Recognition Letters

Time-efficient spam e-mail filtering using n-gram models

Abstract

In this paper, we propose spam e-mail filtering methods with high accuracy and low time complexity. The methods are based on the n-gram approach and a heuristic referred to as the first n-words heuristic. We develop two models, a class general model and an e-mail specific model, and test the methods under both. The models are then combined so that the e-mail specific model is activated in cases where the class general model falls short. Although the proposed approach and methods are general and can be applied to any language, we mainly apply them to Turkish, an agglutinative language, and examine some properties of the language. Extensive tests yielded success rates of about 98% for Turkish and 99% for English. The results show that time complexity can be reduced significantly without sacrificing performance.
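
The abstract does not give the model details, so the following is only a rough sketch of how a class-level n-gram filter with a first n-words cutoff might look. It assumes word bigrams, add-one smoothing, and whitespace tokenization, none of which are confirmed by the abstract; the names `BigramClassModel`, `tokenize`, and `classify` are hypothetical, and the e-mail specific model and the model combination step are not covered.

```python
import math
from collections import Counter


def tokenize(text, first_n=None):
    """Lowercase, split on whitespace, optionally keep only the first N words."""
    words = text.lower().split()
    return words if first_n is None else words[:first_n]


class BigramClassModel:
    """Word-bigram counts for one class (assumed model form, not the paper's)."""

    def __init__(self):
        self.unigrams = Counter()
        self.bigrams = Counter()

    def train(self, messages, first_n=None):
        for msg in messages:
            words = ["<s>"] + tokenize(msg, first_n)
            self.unigrams.update(words)
            self.bigrams.update(zip(words, words[1:]))

    def log_prob(self, words, vocab_size):
        """Add-one smoothed log-probability of a word sequence."""
        seq = ["<s>"] + words
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.unigrams[prev] + vocab_size
            total += math.log(num / den)
        return total


def classify(text, spam_model, legit_model, first_n=50):
    """Score only the first N words under each class model and pick the higher."""
    words = tokenize(text, first_n)
    vocab = set(spam_model.unigrams) | set(legit_model.unigrams) | set(words)
    spam_score = spam_model.log_prob(words, len(vocab))
    legit_score = legit_model.log_prob(words, len(vocab))
    return "spam" if spam_score > legit_score else "legit"
```

Because only the first N words are scored, the cost of classifying a message is bounded by N rather than by the message length, which is one plausible reading of how the heuristic reduces running time.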
