Cross Language Prediction of Vandalism on Wikipedia Using Article Views and Revisions

Abstract

Vandalism is a major issue on Wikipedia, accounting for about 2% (350,000+) of edits in the first 5 months of 2012. Most vandalism is caused by humans, who can leave traces of their malicious behaviour in access and edit logs. We propose detecting vandalism using a range of classifiers in a monolingual setting, and evaluate their performance when they are applied across languages on two data sets: the relatively unexplored hourly count of views of each Wikipedia article, and the commonly used edit history of articles. Within the same language (English and German), these classifiers achieve up to 87% precision, 87% recall, and an F1-score of 87%. Applying these classifiers across languages achieves similarly high results of up to 83% precision, recall, and F1-score. These results show that characteristic vandal traits can be learned from view and edit patterns, and that models built in one language can be applied to other languages.
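
The reported F1-scores follow directly from the precision and recall figures, since F1 is their harmonic mean. The snippet below is a minimal sketch of that arithmetic only; the function name `f1_score` is chosen here for illustration and is not taken from the paper, and the inputs are the upper-bound figures quoted in the abstract rather than the output of any particular experiment.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Upper-bound figures quoted in the abstract (illustrative inputs only).
within_language = f1_score(0.87, 0.87)  # same-language setting
cross_language = f1_score(0.83, 0.83)   # cross-language setting

print(round(within_language, 2), round(cross_language, 2))
```

When precision and recall are equal, the harmonic mean equals that shared value, which is why 87% precision and 87% recall give an F1-score of 87%.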

Bibliographic record

Source: Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2013, pp. 268-279 (12 pages)
Authors: Khoi-Nguyen Tran; Peter Christen
Original format: PDF
