Venue: IEEE International Conference on Big Data and Smart Computing

History-Based Article Quality Assessment on Wikipedia

Abstract

Wikipedia is widely considered the largest encyclopedia on the Internet. Quality assessment of Wikipedia articles has been studied for years. Conventional methods address this task with feature engineering and statistical machine learning algorithms. However, it is difficult for manually defined features to represent the long edit history of an article. Recently, researchers proposed an end-to-end neural model that uses a Recurrent Neural Network (RNN) to learn the representation automatically. Although the RNN proved effective at modeling edit history, the end-to-end method is time- and resource-intensive. In this paper, we propose a new history-based method to represent an article. We also use an RNN to handle the long edit history, but we do not abandon feature engineering: each revision of an article is still represented by manually defined features. This combination of a deep neural model and feature engineering makes our model both simple and effective. Experiments demonstrate that our model performs better than or comparably to previous work and has the potential to run as a real-time service. We also extend our model to perform quality prediction.
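
The combination described in the abstract (hand-crafted features per revision, consumed in order by an RNN) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four features in `revision_features`, the plain Elman-style recurrence, the hidden size, the use of Wikipedia's Stub-to-FA assessment classes as labels, and the random untrained weights. The class `TinyRNNClassifier` is a hypothetical name; a real system would train the weights and would likely use a gated RNN from a deep learning library.

```python
import math
import random

# Assumed label set: Wikipedia's standard assessment classes.
QUALITY_CLASSES = ["Stub", "Start", "C", "B", "GA", "FA"]


def revision_features(text):
    """Map one revision's wikitext to a small hand-crafted feature vector.

    These four features are hypothetical examples, not the paper's features.
    Counts are rough approximations based on substring matching.
    """
    return [
        math.log1p(len(text)),            # log content length
        math.log1p(text.count("<ref")),   # log number of reference tags
        math.log1p(text.count("\n==")),   # log number of section headings
        math.log1p(text.count("[[")),     # log number of internal links
    ]


class TinyRNNClassifier:
    """Elman-style RNN forward pass with random (untrained) weights."""

    def __init__(self, n_features, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.n_hidden = n_hidden
        self.w_in = matrix(n_hidden, n_features)   # input -> hidden
        self.w_rec = matrix(n_hidden, n_hidden)    # hidden -> hidden
        self.w_out = matrix(n_classes, n_hidden)   # hidden -> class logits

    def predict_proba(self, history):
        """Read revisions oldest to newest; return class probabilities."""
        h = [0.0] * self.n_hidden
        for x in history:
            # New hidden state depends on this revision and the previous state.
            h = [
                math.tanh(
                    sum(w * v for w, v in zip(row_in, x))
                    + sum(w * v for w, v in zip(row_rec, h))
                )
                for row_in, row_rec in zip(self.w_in, self.w_rec)
            ]
        logits = [sum(w * v for w, v in zip(row, h)) for row in self.w_out]
        # Numerically stable softmax over the final hidden state's logits.
        peak = max(logits)
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Toy edit history, oldest revision first.
revisions = [
    "A short stub.",
    "A short stub.\n== History ==\nMore text with a [[link]].",
    "Intro.\n== History ==\nText [[link]].<ref>Source</ref>\n== See also ==\n[[Other]]",
]
history = [revision_features(r) for r in revisions]
model = TinyRNNClassifier(n_features=4, n_hidden=8, n_classes=len(QUALITY_CLASSES))
probs = model.predict_proba(history)
```

Because each revision is reduced to a short numeric vector before it reaches the RNN, the recurrent part stays small, which is in line with the abstract's claim that the approach is lighter than a fully end-to-end model. With untrained weights the probabilities in `probs` are close to uniform and carry no meaning.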
