Venue: International World Wide Web Conference; Edinburgh (GB)

Topical TrustRank: Using Topicality to Combat Web Spam

Abstract

Web spam is behavior that attempts to deceive search engine ranking algorithms. TrustRank is a recent algorithm that can combat web spam. However, TrustRank is vulnerable because its seed set may not be representative enough to cover the different topics on the Web. In addition, for a given seed set, TrustRank is biased towards larger communities. To address both issues, we propose using topical information to partition the seed set and calculating trust scores for each topic separately. A combination of these trust scores for a page is then used to determine its ranking. Experimental results on two large datasets show that our Topical TrustRank performs better than TrustRank in demoting spam sites or pages. Compared to TrustRank, our best technique can decrease spam among the top-ranked sites by as much as 43.1%.
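The idea in the abstract (partition the seed set by topic, propagate trust from each partition separately, then combine the per-topic scores) can be illustrated with a minimal Python sketch. The abstract does not specify the propagation details or the combination rule, so everything below is an assumption made for illustration: the function names `trust_rank` and `topical_trust_rank`, the damping factor of 0.85, the fixed iteration count, the choice to give every topic the same total restart mass, and the unweighted sum used to combine topics.

```python
from collections import defaultdict


def trust_rank(out_links, seeds, damping=0.85, iterations=50):
    """Biased PageRank in which the random jump lands only on seed pages.

    out_links maps each page to the list of pages it links to.
    Trust held by pages without out-links is simply dropped
    (a simplifying assumption, not taken from the abstract).
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    seeds = [s for s in seeds if s in nodes]
    if not seeds:
        return {n: 0.0 for n in nodes}

    # Restart distribution: uniform over the seeds, zero elsewhere.
    jump = {n: 0.0 for n in nodes}
    for s in seeds:
        jump[s] = 1.0 / len(seeds)

    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for page in nodes:
            targets = out_links.get(page, [])
            if not targets:
                continue
            share = damping * score[page] / len(targets)
            for target in targets:
                nxt[target] += share
        score = nxt
    return score


def topical_trust_rank(out_links, seeds_by_topic, **kwargs):
    """Run one trust propagation per topic and sum the per-topic scores.

    The unweighted sum is only one possible combination (an assumption here).
    """
    total = defaultdict(float)
    for seeds in seeds_by_topic.values():
        for page, value in trust_rank(out_links, seeds, **kwargs).items():
            total[page] += value
    return dict(total)


if __name__ == "__main__":
    # Hypothetical toy graph: two trusted topical communities and one
    # spam cluster that no trusted page links to.
    graph = {
        "news1": ["news2"],
        "news2": ["news1", "news3"],
        "sport1": ["sport2"],
        "spam1": ["spam2"],
        "spam2": ["spam1"],
    }
    seeds = {"news": ["news1"], "sports": ["sport1"]}
    ranked = sorted(topical_trust_rank(graph, seeds).items(),
                    key=lambda item: item[1], reverse=True)
    for page, trust in ranked:
        print(f"{page}: {trust:.4f}")
```

In this sketch each topic restarts with the same total mass no matter how many seeds it has, so a topic with few seeds is not swamped by one with many. Pages that cannot be reached from any seed, such as the spam cluster in the toy graph, end with a trust score of zero.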