Conference: International Workshop on Semantic Evaluation

Team Rouges at SemEval-2020 Task 12: Cross-lingual Inductive Transfer to Detect Offensive Language

Abstract

With the growing use of social media and its availability, many instances of the use of offensive language have been observed across multiple languages and domains. This phenomenon has given rise to the growing need to detect the offensive language used in social media cross-lingually. In OffensEval 2020, the organizers have released the multilingual Offensive Language Identification Dataset (mOLID), which contains tweets in five different languages, to detect offensive language. In this work, we introduce a cross-lingual inductive approach to identify the offensive language in tweets using the contextual word embedding XLM-RoBERTa (XLM-R). We show that our model performs competitively on all five languages, obtaining the fourth position in the English task with an F1-score of 0.919 and eighth position in the Turkish task with an F1-score of 0.781. Further experimentation proves that our model works competitively in a zero-shot learning environment, and is extensible to other languages.
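
The abstract reports results as F1-scores but does not say how they are averaged. As a minimal sketch, assuming the macro-averaged F1 over two classes that offensive-language shared tasks commonly report, the metric could be computed as below. The label names `OFF` and `NOT` and the toy data are illustrative assumptions, not taken from the paper.

```python
def f1_for_label(gold, pred, label):
    """F1 for one class, treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=("OFF", "NOT")):
    """Unweighted mean of per-class F1 (label names are assumed)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(f1_for_label(gold, pred, lab) for lab in labels) / len(labels)


# Toy example (hypothetical labels, not data from the paper).
gold = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "OFF", "OFF", "NOT", "NOT"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.3f}")
```

Macro averaging weights both classes equally, which matters when offensive tweets are a minority of the data: a model that always predicts the majority class scores poorly on the minority class and is penalized accordingly.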
