Graphics Processing Units (GPUs) support dynamic voltage and frequency scaling (DVFS) to balance computational performance and energy consumption. However, a simple and accurate method is still lacking for estimating the performance of a given GPU kernel under different frequency settings on real hardware, which is important for choosing the best frequency configuration for energy saving. This paper presents a fine-grained model that estimates the execution time of GPU kernels under both core and memory frequency scaling. Over a 2.5x range of both core and memory frequencies across 12 GPU kernels, our model achieves accurate results (within 3.5%) on real hardware. Compared with cycle-level simulators, our model needs only a few simple micro-benchmarks to extract a set of hardware parameters, together with the performance counters of the kernels, to produce this high accuracy.