Journal: ACM Transactions on the Web

Propagating Both Trust and Distrust with Target Differentiation for Combating Link-Based Web Spam

Abstract

Semi-automatic anti-spam algorithms propagate either trust through links from a good seed set (e.g., TrustRank) or distrust through inverse links from a bad seed set (e.g., Anti-TrustRank) to the entire Web. These algorithms have proven effective in combating link-based Web spam because they integrate human judgement with machine intelligence. Nevertheless, there is still much room for improvement. One issue with most existing trust/distrust propagation algorithms is that only trust or only distrust is propagated, and only a good seed set or only a bad seed set is used. According to Wu et al. [2006a], combining trust and distrust propagation can lead to better results, and an effective framework is needed to realize this insight. Another, more serious issue is that existing algorithms propagate trust or distrust in nondifferential ways; that is, a page propagates its trust or distrust score uniformly to its neighbors, without considering whether each neighbor should be trusted or distrusted. Such blind propagation schemes are inconsistent with the original intention of trust/distrust propagation. However, it seems impossible to implement differential propagation if only trust or only distrust is propagated. In this article, we take the view that each Web page has both a trustworthy side and an untrustworthy side, and we thus assign two scores to each Web page: T-Rank, scoring the trustworthiness of the page, and D-Rank, scoring its untrustworthiness. We then propose an integrated framework that propagates both trust and distrust. In the framework, the propagation of T-Rank/D-Rank is penalized by the target's current D-Rank/T-Rank. In other words, the propagation of T-Rank/D-Rank is decided by the target's current (generalized) probability of being trustworthy/untrustworthy; thus a page propagates more trust/distrust to a trustworthy/untrustworthy neighbor than to an untrustworthy/trustworthy neighbor. In this way, propagation of both trust and distrust with target differentiation is implemented. We use T-Rank scores for spam demotion and D-Rank scores for spam detection. The proposed Trust-DistrustRank (TDR) algorithm reduces to TrustRank and Anti-TrustRank when the penalty factor is set to 1 and 0, respectively. Thus TDR can be seen as a combinatorial generalization of both TrustRank and Anti-TrustRank. TDR not only makes full use of both trust and distrust propagation, but also overcomes the disadvantages of both TrustRank and Anti-TrustRank. Experimental results on benchmark datasets show that TDR outperforms other semi-automatic anti-spam algorithms on both spam demotion and spam detection tasks under various criteria.
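The abstract does not give the update equations, so the following Python sketch is only one plausible reading of "propagation penalized by the target's current score", not the authors' formulation. The function names `tdr` and `share`, the parameters `beta` (penalty factor), `alpha` (damping, set to 0.85 here), and `eps` (a small smoothing constant), and the exact form of the penalty are all assumptions made for illustration. The penalty is chosen so that T-Rank matches a plain TrustRank-style update when `beta` is 1 and D-Rank matches a plain Anti-TrustRank-style update when `beta` is 0.

```python
def tdr(out_links, good_seeds, bad_seeds, beta=0.5, alpha=0.85, iters=50, eps=1e-12):
    """Illustrative trust/distrust propagation with target differentiation.

    out_links maps each page to the pages it links to (no duplicate links).
    All parameter names and the penalty form are assumptions, not taken
    from the paper. Dangling pages simply leak score mass in this sketch.
    """
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    out = {p: list(out_links.get(p, [])) for p in pages}
    inc = {p: [] for p in pages}  # inverse links
    for q, targets in out.items():
        for p in targets:
            inc[p].append(q)

    # Uniform seed distributions over the good and the bad seed sets.
    seed_t = {p: 1.0 / len(good_seeds) if p in good_seeds else 0.0 for p in pages}
    seed_d = {p: 1.0 / len(bad_seeds) if p in bad_seeds else 0.0 for p in pages}
    t, d = dict(seed_t), dict(seed_d)

    def share(own, other, w):
        # Assumed "generalized probability" that the target deserves the
        # incoming score; equals 1 when w == 1 or when other == 0.
        return (w * own + eps) / (w * own + (1.0 - w) * other + eps)

    for _ in range(iters):
        new_t, new_d = {}, {}
        for p in pages:
            # Trust flows along links; distrust flows along inverse links.
            trust_in = sum(t[q] / len(out[q]) for q in inc[p])
            distrust_in = sum(d[q] / len(inc[q]) for q in out[p])
            new_t[p] = (alpha * share(t[p], d[p], beta) * trust_in
                        + (1.0 - alpha) * seed_t[p])
            new_d[p] = (alpha * share(d[p], t[p], 1.0 - beta) * distrust_in
                        + (1.0 - alpha) * seed_d[p])
        t, d = new_t, new_d
    return t, d


# Toy graph: good seed "g" links to "a" and "s"; "s" links to bad seed "b".
links = {"g": ["a", "s"], "s": ["b"]}
t_rank, d_rank = tdr(links, good_seeds={"g"}, bad_seeds={"b"}, beta=0.5)
print(t_rank["a"] > t_rank["s"])
```

In the toy graph, uniform propagation (`beta=1`) gives "a" and "s" the same trust because both are linked from the good seed. With `beta=0.5`, "s" accumulates distrust from linking to the bad seed and therefore receives less trust than "a", which is the differentiation the abstract describes.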
