
Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition


Abstract

Recurrent Neural Networks (RNNs) are powerful sequence modeling tools. However, when dealing with high-dimensional inputs, training RNNs becomes computationally expensive due to the large number of model parameters. This hinders RNNs from solving many important computer vision tasks, such as Action Recognition in Videos and Image Captioning. To overcome this problem, we propose a compact and flexible structure, namely Block-Term tensor decomposition, which greatly reduces the parameters of RNNs and improves their training efficiency. Compared with alternative low-rank approximations, such as the Tensor-Train RNN (TT-RNN), our method, the Block-Term RNN (BT-RNN), is not only more concise (when using the same rank), but also able to attain a better approximation to the original RNNs with far fewer parameters. On three challenging tasks, including Action Recognition in Videos, Image Captioning, and Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of both prediction accuracy and convergence rate. Specifically, BT-LSTM uses 17,388 times fewer parameters than the standard LSTM while achieving an accuracy improvement of over 15.6% on the Action Recognition task on the UCF11 dataset.
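To make the parameter saving described in the abstract concrete, the following is a minimal NumPy sketch of the Block-Term format: a dense weight matrix is folded into a higher-order tensor and approximated by a sum of N Tucker terms, each consisting of a small core tensor plus one factor matrix per mode. The shapes, ranks, and folding scheme here are illustrative assumptions, not the paper's exact BT-RNN construction.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical folding: a dense W of shape (64, 256) becomes a 4-way
    # tensor of shape (I1, I2, J1, J2). These dimensions are assumptions
    # chosen for illustration.
    I1, I2, J1, J2 = 8, 8, 16, 16
    dims = (I1, I2, J1, J2)
    N, R = 2, 4  # N Tucker blocks, Tucker rank R on every mode

    # Block-Term parameters: per block, one core (R^4 entries) and one
    # factor matrix per mode (dims[k] x R).
    cores = [rng.standard_normal((R, R, R, R)) for _ in range(N)]
    factors = [[rng.standard_normal((d, R)) for d in dims] for _ in range(N)]

    def bt_to_matrix(cores, factors):
        """Reconstruct the dense weight matrix from Block-Term factors:
        the sum over blocks of a Tucker reconstruction (core contracted
        with one factor matrix per mode), reshaped back to a matrix."""
        T = sum(
            np.einsum("abcd,ia,jb,kc,ld->ijkl", G, *As)
            for G, As in zip(cores, factors)
        )
        return T.reshape(I1 * I2, J1 * J2)

    W = bt_to_matrix(cores, factors)

    dense_params = (I1 * I2) * (J1 * J2)
    bt_params = N * (R ** 4 + sum(d * R for d in dims))
    print(f"dense: {dense_params} params, BT: {bt_params} params "
          f"({dense_params / bt_params:.1f}x fewer)")  # ~18x fewer here

The storage cost scales as N(R^d + R * sum of mode sizes) instead of the product of all mode sizes, so the compression ratio grows rapidly with the weight matrix's dimensions; the 17,388x figure reported in the abstract comes from applying this principle to the much larger input-to-hidden weights of an LSTM, not from the toy sizes above.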
