Assessing Post Deletion in Sina Weibo: Multi-modal Classification of Hot Topics

Abstract

Chinese social media applications such as Weibo are widely known for monitoring and deleting posts to conform to Chinese government requirements. In this paper, we analyze a dataset of censored and uncensored Weibo posts. Unlike previous work, which considers only the text content of posts, we take a multi-modal approach that accounts for both text and image content. We divide this dataset into 14 categories that have the potential to be censored on Weibo, and seek to quantify censorship by topic. Specifically, we investigate how different factors interact to affect censorship, and how consistently and how quickly different topics are censored. To this end, we assembled an image dataset of 18,966 images and a text dataset of 994 posts from 14 categories. We then used deep learning, CNN localization, and NLP techniques to analyze the target dataset and extract categories for further analysis, in order to better understand censorship mechanisms on Weibo. We found that sentiment is the only indicator of censorship that is consistent across the variety of topics we identified, a finding that matches recently leaked logs from Sina Weibo. We also found that most categories, such as those related to anti-government actions (e.g., protest) or to politicians (e.g., Xi Jinping), are often censored, whereas some categories, such as crisis-related ones (e.g., rainstorm), are censored less frequently. Finally, censored posts across all categories are deleted in three hours on average.

Bibliographic record

Source
Workshop on natural language processing for internet freedom | 2019 | x 170 p. | 9 pages
Authors
Meisam Navaki Arefi; Rajkumar Pandi; Jedidiah R. Crandall; Michael Carl Tschantz; King-wa Fu; Dahlia Qiu Shi; Miao Sha
Original format
PDF
Chinese Library Classification
Programming, software engineering

