Workshop on Arabic Natural Language Processing

Arabic Compact Language Modelling for Resource Limited Devices


Abstract

Natural language modelling has gained a lot of interest recently. The current state-of-the-art results are achieved by first pretraining a very large language model and then fine-tuning it on multiple tasks. However, there is little work on smaller, more compact language models for resource-limited devices or applications, let alone on how to efficiently train such models for a low-resource language like Arabic. In this paper, we investigate how such models can be trained in a compact way for Arabic. We also show how distillation and quantization can be applied to create even smaller models. Our experiments show that our largest model, which is 2x smaller than the baseline, achieves better results on multiple tasks while using 2x less pretraining data.
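
The abstract names two standard compression steps, knowledge distillation and quantization, without showing code. As a rough illustration only (a generic sketch, not the authors' implementation), a Hinton-style distillation loss and post-training dynamic quantization in PyTorch could look as follows; the temperature, the alpha weighting, and the function names are assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=2.0, alpha=0.5):
        # Softened KL term against the teacher's distribution, rescaled
        # by T^2 so its gradient magnitude matches the hard-label term.
        t = temperature
        soft = F.kl_div(
            F.log_softmax(student_logits / t, dim=-1),
            F.softmax(teacher_logits / t, dim=-1),
            reduction="batchmean",
        ) * (t * t)
        # Ordinary cross-entropy on the gold labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

    def quantize_for_deployment(model: nn.Module) -> nn.Module:
        # Post-training dynamic quantization: Linear weights are stored
        # as int8 and activations are quantized on the fly at inference.
        return torch.quantization.quantize_dynamic(
            model, {nn.Linear}, dtype=torch.qint8
        )

Distillation requires a training loop in which the compact student sees the teacher's logits, whereas dynamic quantization is applied once after training and roughly quarters the footprint of the model's dense layers.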
