Quality of Wikipedia Articles: Analyzing Features and Building a Ground Truth for Supervised Classification



Abstract

Wikipedia is nowadays one of the biggest online resources on which users rely as a source of information. The amount of collaboratively generated content that is sent to the online encyclopedia every day can lead to the creation of low-quality articles (and, consequently, misinformation) if it is not properly monitored and revised. For this reason, this paper considers the problem of automatically assessing the quality of Wikipedia articles. In particular, the focus is (i) on the analysis of groups of hand-crafted features that can be employed by supervised machine learning techniques to classify Wikipedia articles on a qualitative basis, and (ii) on the analysis of some issues behind the construction of a suitable ground truth. The analyzed features are evaluated on a specifically built labeled dataset by implementing different supervised classifiers based on distinct machine learning algorithms, which produced promising results.
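The general idea in the abstract, hand-crafted features fed to a supervised classifier, can be sketched as follows. This is only an illustration under stated assumptions: the four features (word, heading, reference, and link counts), the "high"/"low" labels, the toy training texts, and the nearest-centroid classifier are all hypothetical choices made for this sketch. The abstract does not name the paper's actual features, quality classes, or algorithms.

```python
import math
import re


def extract_features(wikitext):
    """Hypothetical hand-crafted features (not the paper's feature set)."""
    words = len(wikitext.split())
    # Lines that look like wikitext section headings, e.g. "== History =="
    headings = len(re.findall(r"^==+[^=].*?==+\s*$", wikitext, flags=re.MULTILINE))
    refs = len(re.findall(r"<ref[ >]", wikitext))  # opening <ref> tags
    links = wikitext.count("[[")                   # internal links
    # log1p dampens the scale difference between the raw counts
    return [math.log1p(x) for x in (words, headings, refs, links)]


def train_centroids(samples):
    """samples: list of (wikitext, label). Returns label -> mean feature vector."""
    sums, counts = {}, {}
    for text, label in samples:
        feats = extract_features(text)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, wikitext):
    """Assign the label whose centroid is closest in Euclidean distance."""
    feats = extract_features(wikitext)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


# Toy ground truth, invented for illustration only.
TRAIN = [
    ("== History ==\nLong text " * 50 + "<ref>a</ref> [[Link]] " * 10, "high"),
    ("== Overview ==\n" + "word " * 300 + "<ref name=x>b</ref> [[A]] [[B]] " * 8, "high"),
    ("Short stub about a town.", "low"),
    ("A stub. [[Link]]", "low"),
]

centroids = train_centroids(TRAIN)
print(predict(centroids, "Tiny stub."))
```

A nearest-centroid rule is used here only because it fits in a few lines of standard-library Python; any supervised learner could be substituted once the features are extracted.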


Bibliographic details

Source
International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management | 2019 | 1 (CD-ROM) | 9 pages
Authors
Elias Bassani; Marco Viviani
Original format: PDF
Chinese Library Classification: G354-53
Keywords
Data quality; Wikipedia; Supervised classification; Feature analysis; Ground truth building


Similar literature


1. Supervised Learning Based System for Classification of Wikipedia Articles [J]. Raghavendra S., Lingaraju G. M., Shekar Sivasubramanian. International Journal of Applied Engineering Research. 2018, No. 15aPta5
2. A hybrid approach to classifying Wikipedia article quality flaws with feature fusion framework [J]. Wang Ping, Li Muyan, Li Xiaodan. Expert Systems with Applications. 2021, No. Nova
3. Evaluating quality control of Wikipedia's feature articles [J]. David Lindsey. First Monday. 2010, No. 4
4. Quality of Wikipedia Articles: Analyzing Features and Building a Ground Truth for Supervised Classification [C]. Elias Bassani, Marco Viviani. International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. 2019
5. Feature Extraction and Fusion for Supervised and Semi-supervised Classification: Application to fMRI and LTM Data [D]. Du, Wei. 2014
6. Pharmacy students can improve access to quality medicines information by editing Wikipedia articles [O]. Dorie E. Apollonio, Keren Broyde, Amin Azzam. 2018
7. Improving the potential of pixel-based supervised classification in the absence of quality ground truth data [O]. Pretorius, Erika; Pretorius, Rudi. 2015
