Conference: International World Wide Web Conference

Topical TrustRank: Using Topicality to Combat Web Spam

Abstract

Web spam is behavior that attempts to deceive search engine ranking algorithms. TrustRank is a recent algorithm that can combat web spam. However, TrustRank is vulnerable in that its seed set may not be representative enough to cover the different topics on the Web well. Also, for a given seed set, TrustRank is biased towards larger communities. To address these issues, we propose using topical information to partition the seed set and calculating trust scores for each topic separately. A combination of these trust scores for a page is used to determine its ranking. Experimental results on two large datasets show that our Topical TrustRank performs better than TrustRank in demoting spam sites or pages. Compared to TrustRank, our best technique can decrease spam among the top-ranked sites by as much as 43.1%.
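
The approach described in the abstract (partition the seed set by topic, propagate trust from each partition separately, then combine the per-topic scores) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `trust_rank` and `topical_trust_rank`, the damping factor of 0.85, the fixed iteration count, the plain summation used to combine topic scores, and the toy graph are all assumptions made for illustration; the abstract does not specify which combination the best-performing technique uses.

```python
from collections import defaultdict


def trust_rank(out_links, seeds, damping=0.85, iterations=50):
    """Biased PageRank sketch: trust starts at the seeds and flows along out-links.

    Assumptions: `seeds` is non-empty; pages without out-links simply
    leak their trust (no redistribution), which is enough for ranking.
    """
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    # The jump distribution is uniform over the seed set and zero elsewhere.
    jump = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for page, targets in out_links.items():
            if not targets:
                continue
            share = damping * score[page] / len(targets)
            for target in targets:
                nxt[target] += share
        score = nxt
    return score


def topical_trust_rank(out_links, seeds_by_topic, **kwargs):
    """Run one trust propagation per topic and sum the scores per page.

    Summation is an assumed combination rule chosen for simplicity.
    """
    total = defaultdict(float)
    for seeds in seeds_by_topic.values():
        for page, value in trust_rank(out_links, seeds, **kwargs).items():
            total[page] += value
    return dict(total)


# Hypothetical toy graph: two small topical communities and one page
# that links into a community but receives no links itself.
out_links = {
    "a": ["b"], "b": ["a"],   # community 1
    "c": ["d"], "d": ["c"],   # community 2
    "spam": ["a"],
}
seeds_by_topic = {"topic1": {"a"}, "topic2": {"c"}}
scores = topical_trust_rank(out_links, seeds_by_topic)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the page named `spam` receives no trust because no trusted page links to it, while each community is scored from its own topic seed rather than competing for a single shared seed set.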
