Published in: IEEE International High Level Design Validation and Test Workshop

Modeling, programming and performance analysis of automotive environment map representations on embedded GPUs



Abstract

Future Advanced Driver Assistance Systems (ADAS) require the continuous computation of detailed maps of the vehicle's environment. Because of the high accuracy demands and the enormous amount of data to be fused and processed, architectures commonly used today, such as single-core processors in automotive Electronic Control Units (ECUs), do not provide enough computing power. Emerging embedded multi-core architectures, such as embedded Graphics Processing Units (GPUs), are therefore appealing. In this paper, we (a) identify and analyze common subalgorithms of ADAS algorithms for computing environment maps, such as interval maps, with respect to their suitability for parallelization and execution on embedded GPUs. From this analysis, (b) performance models are derived for the speedups achievable over sequential single-core CPU implementations. As a third contribution, (c) these performance models are validated by presenting a novel parallelized interval map GPU implementation and comparing it against a parallel occupancy grid map implementation. For both types of environment maps, implementations on an Nvidia Tegra K1 prototype are compared to verify the correctness of the introduced performance models. Finally, the achievable speedups with respect to a single-core CPU solution are reported; they range from 3x to 275x for interval and grid map computations.
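The abstract does not spell out the paper's performance models. As a rough illustration of what a speedup model relative to a sequential single-core baseline can look like, the sketch below uses the textbook Amdahl's law. This is a swapped-in, generic model and not the authors' model; the function name `amdahl_speedup` and the example parameter values are assumptions chosen purely for illustration.

```python
def amdahl_speedup(parallel_fraction: float, num_units: int) -> float:
    """Textbook Amdahl's-law speedup over a sequential baseline.

    NOTE: generic illustration only, not the model derived in the paper.
    parallel_fraction: share of the sequential runtime that can be parallelized (0..1).
    num_units: number of parallel processing units applied to that share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part keeps its runtime; the parallel part is divided among the units.
    return 1.0 / (serial_fraction + parallel_fraction / num_units)


if __name__ == "__main__":
    # Hypothetical parallel fractions and unit count, for illustration only.
    for fraction in (0.5, 0.9, 0.99):
        speedup = amdahl_speedup(fraction, num_units=64)
        print(f"parallel fraction {fraction:.2f}: speedup {speedup:.2f}x")
```

Under this simple model the serial share bounds the achievable speedup regardless of how many parallel units are available, which is one reason an analysis of individual subalgorithms for parallelizability, as described in contribution (a), matters before porting them to a GPU.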
