Conference: CIKM 10; ACM Conference on Information and Knowledge Management

Elusive Vandalism Detection in Wikipedia: A Text Stability-based Approach

Abstract

The open, collaborative nature of wikis encourages participation by all users but also exposes their content to vandalism. Current vandalism-detection techniques, while effective against relatively obvious vandalism, prove inadequate for detecting the increasingly prevalent sophisticated (or elusive) vandal edits. We identify a number of vandal edits that can take hours, or even days, to correct, and we propose a text stability-based approach for detecting them. Our approach focuses on the likelihood that a given part of an article will be modified by a regular edit. In addition to text stability, our machine learning-based technique also takes edit patterns into account. We evaluate our approach on a corpus of 15,000 manually labeled edits from the Wikipedia Vandalism PAN corpus. The experimental results show that text stability significantly improves the performance of the selected machine-learning algorithms.
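The abstract does not spell out how text stability is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes whitespace tokenization, treats a token's "age" as the number of consecutive most-recent revisions in which it appears, and scores an edit by the mean age of the tokens it removes. The function names `token_ages` and `stability_of_edit` are hypothetical. Under this assumption, an edit that removes long-lived text gets a high score, which could then be fed to a classifier as one feature alongside edit-pattern features.

```python
def token_ages(revisions):
    """Map each token in the latest revision to the number of
    consecutive most-recent revisions that contain it.

    Assumption: whitespace tokenization and set membership are a
    stand-in for whatever text units the real approach tracks.
    """
    ages = {}
    for text in revisions:
        present = set(text.split())
        # Tokens seen in the previous revision extend their run;
        # new or reintroduced tokens start again at 1.
        ages = {tok: ages.get(tok, 0) + 1 for tok in present}
    return ages


def stability_of_edit(revisions, new_text):
    """Mean age of the tokens that the edit removes from the latest
    revision. Returns 0.0 when the edit removes nothing.

    Assumption: a higher value means the edit touched more stable
    text; this is an illustrative feature, not the paper's measure.
    """
    ages = token_ages(revisions)
    old_tokens = set(revisions[-1].split())
    removed = old_tokens - set(new_text.split())
    if not removed:
        return 0.0
    return sum(ages[tok] for tok in removed) / len(removed)


history = [
    "the cat sat",
    "the cat sat down",
    "the cat sat down quietly",
]

# Replacing a long-lived token scores higher than replacing a recent one.
score_stable = stability_of_edit(history, "the dog sat down quietly")
score_recent = stability_of_edit(history, "the cat sat down softly")
```

In this toy history, the edit that replaces "cat" (present in all three revisions) scores higher than the edit that replaces "quietly" (present only in the latest one), which matches the intuition that regular edits are less likely to touch stable parts of an article.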