
The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models

Abstract

In this paper, we explore the effects of language variants, data sizes, and fine-tuning task types in Arabic pre-trained language models. To do so, we build three pre-trained language models across three variants of Arabic: Modern Standard Arabic (MSA), dialectal Arabic, and classical Arabic, in addition to a fourth language model which is pre-trained on a mix of the three. We also examine the importance of pre-training data size by building additional models that are pre-trained on a scaled-down set of the MSA variant. We compare our different models to each other, as well as to eight publicly available models by fine-tuning them on five NLP tasks spanning 12 datasets. Our results suggest that the variant proximity of pre-training data to fine-tuning data is more important than the pre-training data size. We exploit this insight in defining an optimized system selection model for the studied tasks.
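The fine-tuning procedure described in the abstract can be illustrated with a short sketch. The snippet below fine-tunes a pre-trained Arabic BERT checkpoint on a sentence-classification dataset using the Hugging Face transformers Trainer; the checkpoint name, dataset name, and hyperparameters are placeholders for illustration and are not taken from the paper.

# Minimal fine-tuning sketch (placeholder names, not the authors' exact setup).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "arabic-bert-checkpoint"          # placeholder: an MSA, dialectal, or classical Arabic model
dataset = load_dataset("arabic_classification")  # placeholder: one of the fine-tuning task datasets

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    # Truncate and pad each example so batches have a uniform length.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="finetune-out",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=3e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)

trainer.train()
print(trainer.evaluate())

Repeating this loop with checkpoints pre-trained on different Arabic variants and data sizes, over each task dataset, yields the comparison grid the paper analyzes.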
