Workshop on NLP for COVID-19 at ACL

COVID-19 and Arabic Twitter: How can Arab World Governments and Public Health Organizations Learn from Social Media?

Abstract

In March 2020, the World Health Organization declared the COVID-19 outbreak a pandemic. Most previous social media research on COVID-19 has focused on English tweets. In this study, we collect approximately 1 million Arabic tweets related to COVID-19 from the Twitter streaming API. Focussing on outcomes that we believe will be useful for Public Health Organizations, we analyse the tweets in three different ways: identifying the topics discussed during the period, detecting rumours, and predicting the source of the tweets. For the first goal, we use the k-means algorithm with k=5. The topics discussed can be grouped as follows: COVID-19 statistics, prayers to God, COVID-19 locations, advice and education for prevention, and advertising. For the second goal, we sample 2000 tweets and label them manually as false information, correct information, or unrelated. We then apply three different machine learning algorithms (Logistic Regression, Support Vector Classification, and Naive Bayes) with two sets of features: a word frequency approach and word embeddings. We find that the machine learning classifiers are able to correctly identify the rumour-related tweets with 84% accuracy. For the third goal, we try to predict the source of the rumour-related tweets using our previous model, which classifies tweets into five categories: academic, media, government, health professional, and public. Around 60% of the rumour-related tweets are classified as written by health professionals and academics.
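
As a rough illustration of the rumour-detection step described in the abstract, the sketch below trains a multinomial Naive Bayes classifier on raw word counts and assigns one of the three labels (false, correct, unrelated). It is not the authors' implementation: the abstract does not say which library or preprocessing was used, so the class `WordCountNaiveBayes`, the whitespace tokenizer, and the English toy sentences are all hypothetical stand-ins chosen to keep the example self-contained. Real Arabic tweets would need proper normalization and tokenization, and the other two classifiers and the word-embedding features are not shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


class WordCountNaiveBayes:
    """Hypothetical multinomial Naive Bayes over word counts, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per label
        self.word_counts = defaultdict(Counter)    # word frequencies per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training examples; they are NOT taken from the study's data.
train_texts = [
    "garlic cures the virus completely",
    "drinking hot water kills the virus",
    "wash hands with soap to reduce infection",
    "masks reduce the spread of infection",
    "big discount on phones today",
    "watch the football match tonight",
]
train_labels = ["false", "false", "correct", "correct", "unrelated", "unrelated"]

model = WordCountNaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("soap and masks reduce infection")
print(prediction)
```

With only six toy sentences the prediction is driven entirely by word overlap; the 84% accuracy reported in the abstract refers to the authors' own 2000 manually labelled tweets, not to anything this sketch can reproduce.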