International Symposium on Microarchitecture

Towards Memory Friendly Long-Short Term Memory Networks (LSTMs) on Mobile GPUs



Abstract

Intelligent Personal Assistants (IPAs) with natural language processing (NLP) capabilities are increasingly popular on today's mobile devices. Recurrent neural networks (RNNs), and in particular Long-Short Term Memory networks (LSTMs), are becoming the core machine learning technique in NLP-based IPAs. With the continuously improving performance of mobile GPUs, local processing has become a promising alternative to the cloud-centric computation of IPAs, which incurs large data transmissions and privacy issues. However, LSTMs exhibit a quite inefficient memory access pattern when executed on mobile GPUs, due to redundant data movement and limited off-chip bandwidth. In this study, we explore memory-friendly LSTMs on mobile GPUs by hierarchically reducing off-chip memory accesses. To address the redundant data movement, we propose inter-cell level optimizations that intelligently parallelize the originally sequentially executed LSTM cells (the basic units in RNNs, analogous to neurons in CNNs) to improve data locality across cells with negligible accuracy loss. To relieve the pressure on limited off-chip memory bandwidth, we propose intra-cell level optimizations that dynamically skip the loads and computations of rows in the weight matrices that contribute trivially to the outputs. We also introduce a lightweight module into the GPU architecture for runtime row skipping in the weight matrices. Moreover, our techniques are equipped with thresholds that provide a unique tuning space for performance-accuracy trade-offs guided directly by user preferences. The experimental results show that our optimizations achieve substantial improvements in both performance and power with user-imperceptible accuracy loss, and that they scale well with increasing input data set size. Our user study also shows that the resulting system delivers an excellent user experience.
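The intra-cell row-skipping idea can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the contribution estimate (precomputed row L1 norm times the largest input magnitude, a cheap upper bound on the row's output) and the function name are assumptions chosen for clarity, and in the proposed design the skip decision is made at runtime by the lightweight hardware module rather than in software.

```python
def gate_matvec_with_row_skipping(W, x, threshold):
    """Compute y = W @ x, skipping rows whose estimated contribution
    to the output falls below `threshold`.

    Estimate per row: (L1 norm of the row) * (max |x_j|), an upper
    bound on |y_i|.  Rows below the threshold are neither loaded nor
    multiplied; their outputs are treated as zero.  Raising the
    threshold trades accuracy for fewer loads and computations.
    """
    x_max = max(abs(v) for v in x)
    y = [0.0] * len(W)
    for i, row in enumerate(W):
        # In the hardware design the row norms would be precomputed
        # offline; they are recomputed here only to keep the sketch
        # self-contained.
        row_norm = sum(abs(w) for w in row)
        if row_norm * x_max < threshold:
            continue  # skip the load and the dot product entirely
        y[i] = sum(w * v for w, v in zip(row, x))
    return y
```

With `threshold = 0.0` the function degenerates to a full matrix-vector product, which is the accuracy end of the performance-accuracy tuning space the thresholds expose.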
