
“Got You!”: Automatic Vandalism Detection in Wikipedia with Web-based Shallow Syntactic-Semantic Modeling


Abstract

Discriminating vandalism edits from non-vandalism edits in Wikipedia is a challenging task, as ill-intentioned edits can include a variety of content and be expressed in many different forms and styles. Previous studies are limited to rule-based methods and learning based on lexical features, and lack linguistic analysis. In this paper, we propose a novel Web-based shallow syntactic-semantic modeling method, which uses Web search results as a resource and trains topic-specific n-tag and syntactic n-gram language models to detect vandalism. By combining basic task-specific and lexical features, we have achieved high F-measures using logistic boosting and logistic model trees classifiers, surpassing the results reported by major Wikipedia vandalism detection systems.
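To make the idea of an n-tag language model more concrete, the sketch below trains a bigram model over part-of-speech tag sequences and scores a new edit by its average log-probability. It is only an illustration under stated assumptions: the abstract does not specify the model order, the smoothing, or the tagger, so the add-one smoothing, the hand-written Penn Treebank-style tags, and the function names `train_bigram_tag_model` and `avg_log_prob` are all hypothetical choices, not the authors' implementation.

```python
import math
from collections import Counter


def train_bigram_tag_model(tag_sequences):
    """Count tag bigrams and their left contexts over padded tag sequences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tags in tag_sequences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def avg_log_prob(tags, model):
    """Average add-one-smoothed log-probability per tag transition."""
    context_counts, bigram_counts, vocab = model
    vocab_size = len(vocab) + 1  # one extra slot for tags never seen in training
    padded = ["<s>"] + list(tags) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        prob = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)
        total += math.log(prob)
    return total / (len(padded) - 1)


# Assumed stand-in for tagged, topic-specific Web search results.
topic_tag_sequences = [
    ["DT", "NN", "VBZ", "DT", "JJ", "NN"],
    ["DT", "NN", "VBD", "IN", "DT", "NN"],
    ["NNP", "VBZ", "DT", "JJ", "NN"],
]
model = train_bigram_tag_model(topic_tag_sequences)

plausible_edit = ["DT", "JJ", "NN", "VBZ", "DT", "NN"]
suspicious_edit = ["UH", "UH", "UH", "PRP", "VBP"]

print("plausible edit score: ", avg_log_prob(plausible_edit, model))
print("suspicious edit score:", avg_log_prob(suspicious_edit, model))
```

On this toy data the edit whose tag sequence resembles the topic text scores higher than the interjection-heavy one. In a full system such a score would presumably be one feature among the task-specific and lexical features passed to the classifier, rather than a decision rule on its own.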

Bibliographic record

  • Authors

    McKeown Kathleen; Wang William

  • Affiliation
  • Year: 2010
  • Total pages
  • Original format: PDF
  • Language: English
  • CLC classification
