IEEE SoutheastCon

A talented CPU-to-GPU memory mapping technique



Abstract

Fast, effective analysis of large systems requires high-performance computing (HPC). The NVIDIA Compute Unified Device Architecture (CUDA) platform, which couples a central processing unit (CPU) with a graphics processing unit (GPU), has proven its potential for HPC. In CPU/GPU computing, data and instructions are first copied from CPU main memory to GPU global memory. Inside the GPU, it is more beneficial to keep data in shared memory (visible only to the threads of a single block) than in global memory (visible to all threads). However, GPU shared memory is far smaller than GPU global memory: on a Fermi Tesla C2075, shared memory is 48 KB per block while global memory totals 5.6 GB. In this paper, we introduce a CPU-main-memory to GPU-global-memory mapping technique that improves GPU and overall system performance by increasing the effectiveness of GPU shared memory. Experimental results from solving Laplace's equation for a 512×512 matrix on Fermi and Kepler cards show that the proposed CPU-to-GPU memory mapping technique decreases overall execution time by more than 75%.
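The paper's benchmark, solving Laplace's equation on a 512×512 grid, amounts to an iterative stencil computation such as Jacobi relaxation. A minimal CPU reference sketch follows; this is not the authors' GPU code, and the grid size, iteration count, and boundary condition are illustrative assumptions. On a GPU, each sweep of this stencil is the kernel whose neighbour reads benefit from being staged in fast on-chip shared memory rather than re-fetched from global memory, which is the tradeoff the proposed mapping technique targets.

```python
import numpy as np

def jacobi_laplace(n=512, iterations=100):
    """Jacobi iteration for Laplace's equation on an n x n grid.

    Each sweep replaces every interior point with the average of its
    four neighbours. In a CUDA implementation, a thread block would
    first copy its tile of the grid from global memory into shared
    memory, so the four neighbour reads per point hit on-chip memory.
    """
    u = np.zeros((n, n))
    u[0, :] = 100.0  # illustrative boundary condition: fixed hot top edge
    for _ in range(iterations):
        # Vectorized five-point stencil over the interior points only;
        # the boundary rows/columns are left untouched each sweep.
        u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:])
    return u

grid = jacobi_laplace(n=64, iterations=200)
```

Each sweep reads four neighbours per interior point, so on a GPU the same global-memory value is needed by up to four threads; staging a block's tile in shared memory turns those repeated global loads into on-chip reads, which is why the shared-memory capacity limit (48 KB per block on the C2075) matters.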

