Filtering spam mail in non-segmented languages using hybrid approach: the integration of stopword removal, n-gram extraction and classification techniques


Abstract

Junk mail, or spam, is widely regarded as a major problem today. Spam can lead to cybercrime that affects individuals and organizations alike, and many people and businesses look for spam prevention techniques to protect their data and computer systems. Spam messages usually advertise products or services, and they can also carry viruses, malware, spyware, and so forth. Many people think spam does no damage. In fact, it raises management costs and wastes resources. Verifying and filtering spam therefore needs to be taken into consideration. The objective of this paper is to introduce a hybrid approach that combines three techniques (stop-word removal, n-gram extraction, and data classification) to filter spam emails and simplify system development. Because the approach is language independent, it can be applied across different languages. To examine the approach, the experimental study used the CSDMC2010 spam mail corpus, comprising 198 common emails, 202 spam mails, and 10 selected emails. The results showed that the proposed technique identified whether an email is spam with 93.2% accuracy. Hence, this hybrid approach could help users and organizations reduce computer risk.
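
The three stages named in the abstract (stop-word removal, n-gram extraction, classification) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not name the classifier, the n-gram length, or the stop-word list, so the multinomial Naive Bayes model, the character trigrams, the `STOPWORDS` list, and the toy training messages below are all assumptions chosen for illustration. Stop words are deleted as plain substrings and n-grams are taken over characters, so no word segmentation is needed.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the abstract does not give the one used.
STOPWORDS = ["please", "the", "and"]


def remove_stopwords(text, stopwords=STOPWORDS):
    # Delete stop words as plain substrings, longest first, so the step
    # works without word boundaries (an assumed strategy).
    for word in sorted(stopwords, key=len, reverse=True):
        text = text.replace(word, "")
    return text


def char_ngrams(text, n=3):
    # Slide a window of n characters over the text.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def featurize(text, n=3):
    # Stage 1 (stop-word removal) then stage 2 (n-gram extraction).
    return Counter(char_ngrams(remove_stopwords(text.lower()), n))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed classifier)."""

    def fit(self, texts, labels, n=3):
        self.n = n
        self.docs = Counter(labels)   # documents per class
        self.counts = {}              # n-gram counts per class
        for text, label in zip(texts, labels):
            self.counts.setdefault(label, Counter()).update(featurize(text, n))
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, text):
        feats = featurize(text, self.n)
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.counts.items():
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.docs[label] / total_docs)  # class prior
            for gram, k in feats.items():
                score += k * math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not from CSDMC2010).
train_texts = [
    "win free money now",
    "cheap pills free offer",
    "meeting agenda for monday",
    "please review the project report",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("free money offer"))
print(model.predict("project meeting monday"))
```

On this toy data the first query is labeled spam and the second ham. Nothing here reproduces the reported 93.2% accuracy, which would require the actual corpus and the paper's own settings.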
