
A unified score propagation model for web spam demotion algorithm


Abstract

Web spam pages use a variety of spamming techniques to exploit biases in search engine algorithms and rank higher in search results than they deserve. Many web spam demotion algorithms have been developed to combat spam by using the web link structure to evaluate a goodness or badness score for each web page. These scores are then used to identify spam pages or to penalize their rankings in search engine results. However, most published spam demotion algorithms improve on their base models only marginally and remain vulnerable to some common score manipulation methods. The lack of a general framework for this field makes designing high-performance spam demotion algorithms very inefficient. In this paper, we propose a unified score propagation model for web spam demotion algorithms. The model abstracts the score propagation process of relevant models into a forward score propagation function and a backward score propagation function, each of which can be further expressed as three sub-functions: a splitting function, an accepting function and a combination function. On the basis of the proposed model, we develop two new web spam demotion algorithms named Supervised Forward and Backward score Ranking (SFBR) and Unsupervised Forward and Backward score Ranking (UFBR). Our experiments, conducted on three large-scale public datasets, show that (1) SFBR is very robust and clearly outperforms other algorithms and (2) UFBR, although unsupervised, obtains results comparable to some well-known supervised algorithms in the spam demotion task.
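
The abstract names the three sub-functions but does not define them, so the following Python sketch is only an illustration of how a propagation step might be decomposed into splitting, accepting and combination. Every concrete choice here is an assumption and not the paper's SFBR or UFBR: `propagate`, `uniform_split`, `accept_all` and `damped_combine` are hypothetical names, the uniform split and damped sum follow the familiar TrustRank-style pattern, and the tiny graph and seed sets are made up. Forward propagation pushes a goodness score along links from a trusted seed; backward propagation pushes a badness score against the link direction from a known spam seed.

```python
from collections import defaultdict


def uniform_split(score, n_targets):
    # Splitting (assumed): divide a page's score evenly among its targets.
    return [score / n_targets] * n_targets if n_targets else []


def accept_all(share):
    # Accepting (assumed): the receiver takes the whole offered share.
    return share


def damped_combine(seed, received, alpha=0.85):
    # Combination (assumed): damped sum of accepted shares plus the seed score.
    return (1 - alpha) * seed + alpha * sum(received)


def propagate(edges, seeds, split, accept, combine, iterations=20):
    """Propagate scores along directed (src, dst) edges for a fixed number of rounds."""
    nodes = {n for edge in edges for n in edge} | set(seeds)
    targets_of = defaultdict(list)
    for src, dst in edges:
        targets_of[src].append(dst)
    scores = {n: seeds.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        inbox = defaultdict(list)
        for src in nodes:
            targets = targets_of[src]
            shares = split(scores[src], len(targets))
            for dst, share in zip(targets, shares):
                inbox[dst].append(accept(share))
        scores = {n: combine(seeds.get(n, 0.0), inbox[n]) for n in nodes}
    return scores


# Hypothetical three-page chain: trusted -> page -> spam.
edges = [("trusted", "page"), ("page", "spam")]
reversed_edges = [(dst, src) for src, dst in edges]

# Forward: goodness flows along links from a trusted seed.
goodness = propagate(edges, {"trusted": 1.0},
                     uniform_split, accept_all, damped_combine)
# Backward: badness flows against links from a known spam seed.
badness = propagate(reversed_edges, {"spam": 1.0},
                    uniform_split, accept_all, damped_combine)
```

In this sketch, swapping any one of the three callables changes the algorithm without touching the propagation loop, which is the kind of modularity a unified model would allow. How the paper actually defines and combines the forward and backward scores is not stated in the abstract.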

Bibliographic information

  • Source
    Information Retrieval | 2017, Issue 6 | pp. 547-574 | 28 pages
  • Author affiliations

    Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 610031, Sichuan, Peoples R China;

    Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 610031, Sichuan, Peoples R China;

    Feng Chia Univ, Dept Informat Engn & Comp Sci, Taichung 40724, Taiwan|Asia Univ, Dept Comp Sci & Informat Engn, Taichung 41354, Taiwan;

    Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 610031, Sichuan, Peoples R China;

    Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 610031, Sichuan, Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Web spam demotion; Web ranking algorithms; Spam detection; Web scoring system;
