Published in: International Conference on Trust and Trustworthy Computing

LookAhead: Augmenting Crowdsourced Website Reputation Systems with Predictive Modeling

Abstract

Unsafe websites include both malicious sites and inappropriate ones, such as those hosting questionable or offensive content. Website reputation systems are intended to help ordinary users steer away from these unsafe sites. However, the process of assigning safety ratings to websites typically involves humans. Consequently, it is time-consuming, costly, and not scalable. This has resulted in two major problems: (ⅰ) a significant proportion of the web space remains unrated and (ⅱ) there is an unacceptable time lag before new websites are rated. In this paper, we show that by leveraging structural and content-based properties of websites, we can reliably and efficiently predict their safety ratings, thereby mitigating both problems. We demonstrate the effectiveness of our approach using four datasets of up to 90,000 websites. We use ratings from Web of Trust (WOT), a popular crowdsourced web reputation system, as ground truth. We propose a novel ensemble classification technique that makes opportunistic use of available structural and content properties of web pages to predict their eventual ratings in the two dimensions used by WOT: trustworthiness and child safety. Ours is the first classification system to predict such subjective ratings. The same approach works equally well in identifying malicious websites. Across all datasets, our classification achieves an average F1-score in the 74-90% range.
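The abstract does not spell out how the ensemble combines its inputs, so the following is only a minimal sketch of one plausible reading of "opportunistic use of available properties": score each feature group that is present and average the results. It is not the authors' method. The names `ensemble_predict`, `Scorer`, the view labels `"structural"` and `"content"`, and the 0.5 threshold are all hypothetical. The `f1_score` helper shows the standard F1 definition (the harmonic mean of precision and recall) that the reported 74-90% figures refer to.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical per-view scorer: maps a feature vector to an estimated P(unsafe).
Scorer = Callable[[Sequence[float]], float]


def ensemble_predict(
    scorers: Dict[str, Scorer],
    views: Dict[str, Optional[Sequence[float]]],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> Optional[bool]:
    """Average the scores of only those views whose features are available.

    Returns True (unsafe), False (safe), or None when no view is usable.
    """
    scores = [
        scorers[name](feats)
        for name, feats in views.items()
        if feats is not None and name in scorers
    ]
    if not scores:
        return None  # nothing to go on: leave the site unrated
    return sum(scores) / len(scores) >= threshold


def f1_score(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch a site whose content could not be fetched is still classified from its structural features alone, which is one way a predictor could reduce the share of unrated sites.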