Information Processing & Management

Deep contextualized text representation and learning for fake news detection



Abstract

In recent years, the widespread use of social media and broadcasting agencies around the world has left people highly exposed to false information and fake news, which negatively affect both collective opinion and government policy. At the same time, the great success of pre-trained models for embedding contextual information from text has motivated researchers to use these embeddings in a variety of natural language processing tasks. However, in a complex task such as fake news detection, it is not clear which contextualized embedding provides the classifier with the most valuable features. Given the lack of a comparative study of different contextualized pre-trained models combined with distinct neural classifiers, we carry out such a comparison across classifiers and embedding models. In this paper, we propose three classifiers with different pre-trained models for embedding the input news articles. We connect a Single-Layer Perceptron (SLP), a Multi-Layer Perceptron (MLP), and a Convolutional Neural Network (CNN) after an embedding layer built from recent pre-trained models such as BERT, RoBERTa, GPT2, and Funnel Transformer, in order to benefit from the deep contextualized representations these models provide as well as from deep neural classification. We evaluate our proposed models on three well-known fake news datasets: LIAR (Wang, 2017), ISOT (Ahmed et al., 2017), and COVID-19 (Patwa et al., 2020). The results on these three datasets show the superiority of our proposed models for fake news detection compared to state-of-the-art models. They show improvements of 7% and 0.1% in classification accuracy over the model proposed by Goldani et al. (2021) on LIAR and ISOT, respectively, and a 1% improvement over the model proposed by Shifath et al. (2021) on the COVID-19 dataset.
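
The abstract describes a two-stage architecture: a pre-trained transformer encoder produces contextualized token embeddings for a news article, and a neural classifier (SLP, MLP, or CNN) sits on top. The sketch below is not the authors' released code; it is a minimal illustration of that pattern using HuggingFace Transformers and PyTorch, with BERT as the encoder and a CNN head. The model name, filter counts, kernel sizes, and example input are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of an "embedding layer + CNN classifier" fake news model,
# assuming a BERT-style encoder; RoBERTa or Funnel Transformer would be similar swaps.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class EmbeddingCNNClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", num_classes=2,
                 num_filters=100, kernel_sizes=(3, 4, 5)):
        super().__init__()
        # Pre-trained encoder providing deep contextualized token representations.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # 1-D convolutions over the sequence of contextualized token vectors.
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes
        )
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden): contextual embeddings from the encoder.
        hidden_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        x = hidden_states.transpose(1, 2)  # (batch, hidden, seq_len) for Conv1d
        # Convolve, apply ReLU, and max-pool over time for each kernel size.
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.classifier(torch.cat(pooled, dim=1))

# Illustrative usage on a toy input (not a sample from LIAR, ISOT, or COVID-19).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = EmbeddingCNNClassifier()
batch = tokenizer(["Example news article text."], return_tensors="pt",
                  padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])
```

Replacing the CNN head with a single linear layer (SLP) or a stack of linear layers with nonlinearities (MLP) recovers the other two classifier variants the abstract mentions.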
