LIIR at SemEval-2020 Task 12: A Cross-Lingual Augmentation Approach for Multilingual Offensive Language Identification

Abstract

This paper presents our system, 'LIIR', for SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2). We participated in Subtask A for English, Danish, Greek, Arabic, and Turkish. We adapt and fine-tune the BERT and multilingual BERT models made available by Google AI for English and the non-English languages, respectively. For English, we use a combination of two fine-tuned BERT models. For the other languages, we propose a cross-lingual augmentation approach to enrich the training data, and we use multilingual BERT to obtain sentence representations. LIIR ranked 14/38, 18/47, 24/86, 24/54, and 25/40 in Greek, Turkish, English, Arabic, and Danish, respectively.
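The abstract does not spell out how the cross-lingual augmentation is carried out. One plausible reading is that each labelled training post is machine-translated into other languages and the translations are added to the training set with the original label. The sketch below illustrates that reading only; `augment_cross_lingual`, `fake_translate`, the pivot language codes, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]  # (text, label); here 1 = offensive, 0 = not offensive


def augment_cross_lingual(
    examples: List[Example],
    translate: Callable[[str, str], str],
    pivot_languages: List[str],
) -> List[Example]:
    """Return the original examples plus one translated copy per pivot language.

    Labels are copied unchanged, on the assumption that translation
    preserves whether a post is offensive.
    """
    augmented = list(examples)
    for text, label in examples:
        for lang in pivot_languages:
            augmented.append((translate(text, lang), label))
    return augmented


def fake_translate(text: str, target_lang: str) -> str:
    # Hypothetical stand-in for a real machine-translation call.
    return f"[{target_lang}] {text}"


# Toy training set (invented for illustration).
train = [("eksempel et", 0), ("eksempel to", 1)]
augmented_train = augment_cross_lingual(train, fake_translate, ["en", "tr"])
print(len(augmented_train))  # 2 originals + 2 * 2 translations = 6
```

In a real pipeline, `fake_translate` would be replaced by an actual translation service, and the augmented set would then be fed to a multilingual encoder such as multilingual BERT, which can represent text in all of the languages involved.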

Bibliographic record

Source: International Workshop on Semantic Evaluation, 2020, pp. 2073-2079 (7 pages)
Authors: Erfan Ghadery; Marie-Francine Moens
Original format: PDF
Indexed: 2022-08-26 13:58:11
