Workshop on Arabic Natural Language Processing

Arabic Compact Language Modelling for Resource Limited Devices


Abstract

Natural language modelling has gained a lot of interest recently. The current state-of-the-art results are achieved by first pretraining a very large language model and then fine-tuning it on multiple tasks. However, there is little work on smaller, more compact language models for resource-limited devices or applications, let alone on how to efficiently train such models for a low-resource language like Arabic. In this paper, we investigate how such models can be trained in a compact way for Arabic. We also show how distillation and quantization can be applied to create even smaller models. Our experiments show that our largest model, which is 2x smaller than the baseline, achieves better results on multiple tasks while using 2x less pretraining data.
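
The abstract names two standard compression steps, knowledge distillation and quantization, without showing code. As a rough illustration only (a generic sketch, not the authors' implementation), a Hinton-style distillation loss and post-training dynamic quantization in PyTorch could look as follows; the temperature, the alpha weighting, and the function names are assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=2.0, alpha=0.5):
        # Softened KL term against the teacher's distribution, rescaled
        # by T^2 so its gradient magnitude matches the hard-label term.
        t = temperature
        soft = F.kl_div(
            F.log_softmax(student_logits / t, dim=-1),
            F.softmax(teacher_logits / t, dim=-1),
            reduction="batchmean",
        ) * (t * t)
        # Ordinary cross-entropy on the gold labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

    def quantize_for_deployment(model: nn.Module) -> nn.Module:
        # Post-training dynamic quantization: Linear weights are stored
        # as int8 and activations are quantized on the fly at inference.
        return torch.quantization.quantize_dynamic(
            model, {nn.Linear}, dtype=torch.qint8
        )

Distillation requires a training loop in which the compact student sees the teacher's logits, whereas dynamic quantization is applied once after training and roughly quarters the footprint of the model's dense layers.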
