IEEE International Conference on Cluster Computing

Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning



Abstract

Many deep learning applications are deployed in dynamic environments that change over time, in which the training models are expected to be continuously updated with streaming data in order to better capture data trends. However, most state-of-the-art learning frameworks support offline training well while omitting online model-updating strategies. In this work, we propose and implement iDlaLayer, a thin middleware layer on top of existing training frameworks that streamlines the support and implementation of online deep learning applications. In pursuit of good model quality as well as fast data incorporation, we design a Data Life Aware model updating strategy (DLA), which builds training data samples according to the contributions of data from different life stages and accounts for the training cost consumed in model updating. We evaluate iDlaLayer's performance through both simulations and experiments based on TensorflowOnSpark with three representative online learning workloads. Our experimental results demonstrate that iDlaLayer reduces the overall elapsed time of MNIST, Criteo, and PageRank by 11.3%, 28.2%, and 15.2%, respectively, compared to the periodic update strategy. It further achieves an average 20% decrease in training cost and about a 5% improvement in model quality over the traditional continuous training method.
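The two ideas the abstract names — weighting training data by its life stage and gating model updates on training cost — can be illustrated with a minimal sketch. All names, thresholds, and weight values below are hypothetical assumptions for illustration, not the paper's actual DLA algorithm, which is defined in the full text.

```python
import random

# Hypothetical life-stage weights (illustrative only): the assumption is that
# fresher data describes the current data trend better than stale data.
LIFE_STAGE_WEIGHTS = {"fresh": 1.0, "mature": 0.5, "stale": 0.1}

def life_stage(age_s, fresh_s=60, mature_s=600):
    """Classify a sample by its age in seconds (thresholds are made up)."""
    if age_s < fresh_s:
        return "fresh"
    if age_s < mature_s:
        return "mature"
    return "stale"

def build_training_batch(stream_buffer, now, batch_size, rng=random):
    """Draw a training batch where a sample's selection probability
    follows its life-stage weight. stream_buffer holds (timestamp, sample)."""
    weights = [LIFE_STAGE_WEIGHTS[life_stage(now - t)] for t, _ in stream_buffer]
    return rng.choices([x for _, x in stream_buffer],
                       weights=weights, k=batch_size)

def should_update(expected_gain, update_cost, threshold=1.0):
    """Trigger a model update only when the expected quality gain per unit
    of training cost exceeds a threshold (a stand-in for DLA's cost awareness)."""
    return expected_gain / update_cost > threshold
```

A real implementation would sit, as iDlaLayer does, between the streaming source and the training framework, feeding the weighted batch to the existing offline trainer whenever `should_update` fires.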


