International Journal of Security and Networks

Reversing the effects of tokenisation attacks against content-based spam filters

Abstract

More than 85% of received emails are spam. Many current solutions use machine-learning algorithms trained on statistical representations of the terms that appear most commonly in such emails. However, some attacks can subvert the filtering capabilities of these methods: tokenisation attacks insert characters within words, breaking up the terms the filters rely on. In this paper, we introduce a new method that reverses the effects of tokenisation attacks. Our method processes emails iteratively, considering possible words starting from the first token, and compares the word candidates with a common dictionary to which spam words have previously been added. We provide an empirical study of how tokenisation attacks affect the filtering capability of a Bayesian classifier, and we show that our method can reverse the effects of these attacks.
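The abstract gives only a high-level description of the method, so the following is a minimal sketch of one way such a reversal could work, not the authors' implementation. It assumes that the attack splits words with non-alphanumeric characters, that candidates are built by joining consecutive tokens from the current position, and that the longest candidate found in the dictionary wins. The function name `reverse_tokenisation`, the `max_span` limit, and the example dictionary are all hypothetical.

```python
import re


def reverse_tokenisation(text, dictionary, max_span=8):
    """Greedily rejoin fragments of words split by inserted characters.

    Assumption (not from the paper): tokens are separated by any run of
    non-alphanumeric characters, and the longest dictionary match that
    starts at the current token is accepted.
    """
    # Split on inserted separators such as '.', '-', '_' or spaces.
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]
    words = []
    i = 0
    while i < len(tokens):
        best_len = 1  # fall back to emitting the token unchanged
        candidate = ""
        # Build candidates from token i, up to max_span tokens long.
        for j in range(i, min(i + max_span, len(tokens))):
            candidate += tokens[j]
            # Only merged candidates (two or more tokens) need a lookup.
            if j > i and candidate.lower() in dictionary:
                best_len = j - i + 1
        words.append("".join(tokens[i:i + best_len]))
        i += best_len
    return words


# Hypothetical dictionary: common words plus previously added spam words.
example_dictionary = {"buy", "cheap", "now", "viagra"}
print(reverse_tokenisation("buy v.i.a.g.r.a now", example_dictionary))
# → ['buy', 'viagra', 'now']
```

A greedy longest match like this can also merge legitimate neighbouring words (for example "a" followed by "part" into "apart"), so a real filter would likely need additional checks before trusting a merged candidate.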