Hybrid spamicity score approach to web spam detection

Abstract

Web spamming refers to actions intended to mislead search engines into giving some pages a higher ranking than they deserve. Fundamentally, Web spam is designed to pollute search engines and corrupt the user experience by driving traffic to particular spammed Web pages, regardless of the merits of those pages. Recently, there has been a dramatic increase in the amount of Web spam, leading to a degradation of search results. Most existing Web spam detection methods are supervised and require a large set of training Web pages. The proposed system studies the problem of unsupervised Web spam detection. It introduces the notion of spamicity to measure how likely a page is to be spam. Spamicity is a more flexible measure than traditional supervised classification. In the proposed system, link and content spam techniques are used to determine the spamicity score of a Web page. A threshold, set by empirical analysis, classifies the Web page as spam or non-spam.
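
The abstract does not say how the link and content signals are combined or what threshold is used. As a minimal sketch only, the idea of a hybrid score with a cutoff could look like the following, assuming each signal has already been reduced to a number in [0, 1] and that the two are blended by a weighted average. The names `PageSignals`, `spamicity`, `classify`, the weight, and the threshold value are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    """Hypothetical per-page inputs, each assumed to lie in [0, 1]."""
    content_score: float  # e.g. derived from content-spam heuristics
    link_score: float     # e.g. derived from link-spam heuristics


def spamicity(signals: PageSignals, content_weight: float = 0.5) -> float:
    """Blend the two signals with a weighted average (an assumed rule)."""
    if not 0.0 <= content_weight <= 1.0:
        raise ValueError("content_weight must be in [0, 1]")
    return (content_weight * signals.content_score
            + (1.0 - content_weight) * signals.link_score)


def classify(signals: PageSignals, threshold: float = 0.5,
             content_weight: float = 0.5) -> str:
    """Label a page by comparing its spamicity to a threshold.

    The default threshold is a placeholder; the abstract says the real
    value is chosen by empirical analysis.
    """
    score = spamicity(signals, content_weight)
    return "spam" if score >= threshold else "non-spam"
```

Because the score is continuous, pages can also be ranked by spamicity instead of only being split into two classes, which is one way to read the abstract's claim that spamicity is more flexible than a fixed classifier.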