NUIG at SemEval-2020 Task 12: Pseudo labelling for offensive content classification

Abstract

This work addresses the classification problem defined by sub-task A (English only) of the OffensEval 2020 challenge. We used a semi-supervised approach to classify tweets as offensive (OFF) or not offensive (NOT). Because the OffensEval 2020 dataset is only loosely labelled, with confidence scores produced by unsupervised models, we used last year's Offensive Language Identification Dataset (OLID) to label it through pseudo-labelling. We trained four text classifiers on OLID and used the one with the highest macro-averaged F1-score to pseudo-label the OffensEval 2020 dataset. That best-performing model was then trained on the combination of OLID and the pseudo-labelled OffensEval 2020 data. We evaluated the classifiers on the OLID and OffensEval 2020 datasets using precision and recall, with macro-averaged F1-score as the primary evaluation metric.
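The procedure described in the abstract (train several classifiers on labelled data, pick the one with the best macro-averaged F1-score, pseudo-label the unlabelled data with it, then retrain on the combined set) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not name the four classifiers, so `KeywordClassifier`, the candidate names `loose` and `strict`, and the toy tweets are all hypothetical stand-ins, and only two candidates are used to keep the example short.

```python
from collections import Counter

LABELS = ("OFF", "NOT")


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


class KeywordClassifier:
    """Hypothetical toy model: predicts OFF when a tweet contains a word seen
    in at least `min_count` OFF training tweets and in fewer NOT tweets."""

    def __init__(self, min_count=1):
        self.min_count = min_count
        self.off_words = set()

    def fit(self, texts, labels):
        off, not_off = Counter(), Counter()
        for text, label in zip(texts, labels):
            (off if label == "OFF" else not_off).update(set(text.lower().split()))
        self.off_words = {
            w for w, c in off.items() if c >= self.min_count and c > not_off[w]
        }
        return self

    def predict(self, texts):
        return ["OFF" if self.off_words & set(t.lower().split()) else "NOT"
                for t in texts]


def pseudo_label_pipeline(candidates, labelled, dev, unlabelled):
    train_x, train_y = labelled
    dev_x, dev_y = dev
    # Step 1: train every candidate on the labelled set, score it on dev data.
    scored = []
    for name, make_model in candidates.items():
        model = make_model().fit(train_x, train_y)
        scored.append((macro_f1(dev_y, model.predict(dev_x)), name, model))
    best_score, best_name, best_model = max(scored, key=lambda s: s[0])
    # Step 2: the best model assigns pseudo labels to the unlabelled tweets.
    pseudo_y = best_model.predict(unlabelled)
    # Step 3: retrain the same kind of model on labelled + pseudo-labelled data.
    final = candidates[best_name]().fit(train_x + unlabelled, train_y + pseudo_y)
    return best_name, best_score, pseudo_y, final


# Toy data standing in for OLID (labelled) and OffensEval 2020 (unlabelled).
train_x = ["you are an idiot", "have a nice day", "what an idiot move",
           "nice weather today", "you are kind"]
train_y = ["OFF", "NOT", "OFF", "NOT", "NOT"]
dev_x, dev_y = ["what a nice day", "such an idiot"], ["NOT", "OFF"]
unlabelled = ["total idiot", "lovely day"]

candidates = {
    "loose": lambda: KeywordClassifier(min_count=1),
    "strict": lambda: KeywordClassifier(min_count=2),
}
best_name, best_score, pseudo_y, final = pseudo_label_pipeline(
    candidates, (train_x, train_y), (dev_x, dev_y), unlabelled
)
print(best_name, best_score, pseudo_y)
```

Macro-averaging gives the OFF and NOT classes equal weight regardless of how many tweets fall in each, which is why it is a common choice when one class is much rarer than the other.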

Bibliographic details

Source: International Workshop on Semantic Evaluation, 2020, pp. 1598-1604 (7 pages)
Authors: Shardul Suryawanshi; Mihael Arcan; Paul Buitelaar
Original format: PDF
