Journal: International Journal of Organizational and Collective Intelligence

Detecting Webspam Beneficiaries Using Information Collected by the Random Surfer

Abstract

Search engines use several criteria to rank webpages and choose which pages to display when answering a request. Those criteria can be separated into two notions, relevance and popularity. Popularity is calculated by the search engine and is related to the links made to the webpage. Malicious webmasters want to artificially increase their popularity; the techniques they use are often referred to as Webspam. Webspam can take many forms and is in constant evolution, but it usually consists of building a dedicated structure of spam pages around a given target page. It is important for a search engine to address the issue of Webspam; otherwise, it cannot provide users with fair and reliable results. In this paper, the authors propose a technique to identify Webspam through the frequency language associated with random walks among those dedicated structures. The authors identify the language by calculating the frequency of appearance of k-grams on random walks launched from every node.
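The counting step described in the last sentence can be illustrated with a minimal sketch. The abstract does not say which symbols the k-grams are built over, so this example assumes each page carries a label in a hypothetical `labels` mapping; the function names, the uniform choice among out-links, and the parameters `walk_length` and `walks_per_node` are likewise assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter


def random_walk(graph, start, length, rng):
    """Follow uniformly random out-links from `start` for up to `length` steps."""
    walk = [start]
    node = start
    for _ in range(length):
        successors = graph.get(node, [])
        if not successors:  # dangling page: stop the walk early
            break
        node = rng.choice(successors)
        walk.append(node)
    return walk


def kgram_frequencies(graph, labels, k, walk_length, walks_per_node=1, seed=0):
    """Relative frequency of each k-gram of labels seen on walks from every node.

    `graph` maps a page to the list of pages it links to.
    `labels` maps a page to a symbol (an assumption: the abstract does not
    define the alphabet of the frequency language).
    """
    rng = random.Random(seed)
    counts = Counter()
    for start in graph:
        for _ in range(walks_per_node):
            walk = random_walk(graph, start, walk_length, rng)
            symbols = [labels[page] for page in walk]
            # Slide a window of size k over the label sequence of the walk.
            for i in range(len(symbols) - k + 1):
                counts[tuple(symbols[i:i + k])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}
```

Calling `kgram_frequencies(graph, labels, k=2, walk_length=10)` returns a dictionary from label tuples to relative frequencies. How such frequencies are then compared to separate spam structures from ordinary pages is not covered by the abstract and is left out of the sketch.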
