IEEE International Conference on Acoustics, Speech and Signal Processing

Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers


Abstract

The high memory consumption and computational cost of recurrent neural network language models (RNNLMs) limit their wider application on resource-constrained devices. In recent years, neural network quantization techniques capable of producing extremely low-bit compression, for example binarized RNNLMs, have attracted increasing research interest. Directly training quantized neural networks is difficult. By formulating quantized RNNLM training as an optimization problem, this paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM). This method can also flexibly adjust the trade-off between the compression rate and model performance using tied low-bit quantization tables. Experiments on two tasks, Penn Treebank (PTB) and Switchboard (SWBD), suggest that the proposed ADMM quantization achieved model size compression factors of up to 31 times over the full-precision baseline RNNLMs. Model training also converged up to 5 times faster than the baseline binarized RNNLM quantization.
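
To make the ADMM formulation mentioned in the abstract concrete, the sketch below illustrates the generic ADMM quantization scheme: full-precision weights are coupled to an auxiliary quantized copy through an augmented-Lagrangian penalty, and the weight update, quantization projection, and dual update are alternated. This is a minimal, hypothetical illustration under assumed names (project_to_lowbit, admm_quantize, a toy quadratic loss, and a fixed codebook), not the paper's tied quantization tables or RNNLM training setup.

import numpy as np

def project_to_lowbit(w, codebook):
    """G-update: project each weight onto its nearest codebook entry."""
    codebook = np.asarray(codebook)
    idx = np.argmin(np.abs(w[..., None] - codebook), axis=-1)
    return codebook[idx]

def admm_quantize(W, loss_grad, codebook, rho=1e-3, lr=1e-2, steps=100):
    """Alternate the W-, G-, and dual updates; return the quantized weights G."""
    G = project_to_lowbit(W, codebook)   # auxiliary quantized variable
    u = np.zeros_like(W)                 # scaled dual variable
    for _ in range(steps):
        # W-update: gradient step on loss(W) + (rho/2) * ||W - G + u||^2
        grad = loss_grad(W) + rho * (W - G + u)
        W = W - lr * grad
        # G-update: project (W + u) onto the low-bit codebook
        G = project_to_lowbit(W + u, codebook)
        # Dual update: accumulate the residual between W and G
        u = u + W - G
    return G

if __name__ == "__main__":
    # Toy usage: quantize the minimizer of a quadratic "loss" to a binary codebook.
    target = np.array([0.9, -1.2, 0.3, -0.4])
    loss_grad = lambda W: W - target          # gradient of 0.5 * ||W - target||^2
    W0 = np.random.randn(4) * 0.1
    print(admm_quantize(W0, loss_grad, codebook=[-1.0, 1.0]))

In this generic scheme the network loss is only ever optimized over the full-precision weights, while the discrete quantization constraint is handled by the projection step, which is what allows low-bit models to be trained from scratch rather than quantized after training.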
