International Conference on Computational Linguistics

Asynchronous Parallel Learning for Neural Networks and Structured Models with Dense Features



Abstract

Existing asynchronous parallel learning methods are designed only for sparse feature models, and they face new challenges on dense feature models such as neural networks (e.g., LSTM, RNN). The problem with dense features is that asynchronous parallel learning introduces gradient errors arising from overwrite actions. We show that such gradient errors are very common and inevitable. Nevertheless, our theoretical analysis shows that the learning process with gradient errors can still converge towards the optimum of the objective function for many practical applications. Thus, we propose AsynGrad, a simple method for asynchronous parallel learning with gradient error. Based on various dense feature models (LSTM, dense-CRF) and various NLP tasks, experiments show that AsynGrad achieves a substantial improvement in training speed without any loss in accuracy.
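To illustrate the overwrite problem the abstract refers to, below is a minimal, hypothetical sketch (not the paper's AsynGrad method) of lock-free asynchronous SGD on a shared dense parameter vector: each worker reads the parameters, computes a gradient on a toy least-squares objective, and writes back, so concurrent writers can overwrite each other's updates. All names, hyperparameters, and the toy objective are illustrative assumptions, not taken from the paper.

```python
# Sketch of lock-free asynchronous SGD on dense parameters (assumed setup).
# Concurrent read-compute-write cycles can overwrite each other's updates,
# which is the source of the "gradient errors" discussed in the abstract.
import threading
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))               # toy dense features
y = X @ rng.normal(size=20) + 0.01 * rng.normal(size=1000)
w = np.zeros(20)                              # shared dense parameters, no lock

def worker(num_steps, lr=0.05, batch=32):
    global w
    local_rng = np.random.default_rng()
    for _ in range(num_steps):
        idx = local_rng.integers(0, len(X), size=batch)
        w_read = w.copy()                     # possibly stale read of shared parameters
        grad = X[idx].T @ (X[idx] @ w_read - y[idx]) / batch
        # This write-back may overwrite updates made concurrently by other
        # workers; those lost updates are the gradient errors from overwrites.
        w = w_read - lr * grad

threads = [threading.Thread(target=worker, args=(500,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("final mean squared error:", np.mean((X @ w - y) ** 2))
```

In this sketch training still reaches a low error despite the overwrites, in line with the abstract's claim that learning with gradient errors can remain convergent in practice.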
