Journal: Expert Systems with Applications

Applying lazy learning algorithms to tackle concept drift in spam filtering

Abstract

A great number of machine learning techniques have been applied to problems where data is collected over an extended period of time. However, a difficulty with many real-world applications is that the distribution underlying the data is likely to change over time. In these situations, a problem that many global eager learners face is their inability to adapt to local concept drift. Concept drift in spam is particularly difficult to handle, as spammers actively change the nature of their messages to elude spam filters. Algorithms that track concept drift must be able to identify a change in the target concept (spam or legitimate e-mails) without direct knowledge of the underlying shift in distribution. In this paper we show how a previously successful instance-based reasoning e-mail filtering model can be improved in order to better track concept drift in the spam domain. Our proposal is based on the definition of two complementary techniques able to select both terms and e-mails representative of the current situation. The enhanced system is evaluated against other well-known, successful lazy learning approaches in two scenarios, all within a cost-sensitive framework. The results obtained from the experiments are very promising and back up the idea that instance-based reasoning systems can offer a number of advantages in tackling concept drift in dynamic problems, as in the case of the anti-spam filtering domain.
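To make the general idea concrete, the following is a minimal sketch of a lazy (instance-based) filter that copes with drift by remembering only recent e-mails. It is not the model described in the paper: the class name `WindowedKnnFilter`, the Jaccard similarity, the sliding window, and the `spam_threshold` parameter are all assumptions chosen for illustration, and the paper's actual term and e-mail selection techniques are not reproduced here.

```python
from collections import deque


def tokenize(text):
    """Lowercase whitespace tokenization into a set of terms."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class WindowedKnnFilter:
    """Hypothetical lazy k-NN filter that keeps only the most recent labelled e-mails."""

    def __init__(self, window_size=100, k=3, spam_threshold=0.5):
        # A bounded deque drops the oldest case automatically, so outdated
        # examples stop influencing predictions as the data drifts.
        self.memory = deque(maxlen=window_size)
        self.k = k
        # Raising this threshold demands more spam votes before blocking,
        # a simple stand-in for a cost-sensitive setting (assumption).
        self.spam_threshold = spam_threshold

    def learn(self, text, is_spam):
        """Store a labelled e-mail; no model is built until prediction time."""
        self.memory.append((tokenize(text), is_spam))

    def predict(self, text):
        """Return True if the k most similar stored e-mails vote for spam."""
        if not self.memory:
            return False
        terms = tokenize(text)
        nearest = sorted(
            self.memory,
            key=lambda case: jaccard(terms, case[0]),
            reverse=True,
        )[: self.k]
        spam_votes = sum(1 for _, is_spam in nearest if is_spam)
        return spam_votes / len(nearest) > self.spam_threshold


if __name__ == "__main__":
    spam_filter = WindowedKnnFilter(window_size=4, k=1)
    spam_filter.learn("cheap pills online", True)
    spam_filter.learn("meeting agenda attached", False)
    # Spammers change vocabulary; the new example is usable immediately.
    spam_filter.learn("crypto investment opportunity", True)
    print(spam_filter.predict("crypto opportunity now"))
```

Because all work is deferred to prediction time, a newly labelled message affects the very next decision, which is the property that makes lazy learners attractive when the target concept shifts.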