IEICE Transactions on Fundamentals of Electronics, Communications & Computer Sciences

Pipeline-Based Partition Exploration for Heterogeneous Multiprocessor Synthesis

Abstract

Partitioning and mapping sequential programs onto multiple parallel processors is one of the most difficult challenges in automating the implementation of application-specific heterogeneous multiprocessor systems-on-chip (MPSoCs). Existing parallelizing techniques do not solve MPSoC-related problems effectively, so designers must still extract the concurrency in a program by hand. Removing this bottleneck requires an automated application partitioning technique. Fully automatic parallelization is ineffective in general, however, so a promising alternative is to exploit concurrency in certain special structures that occur in practice. This paper proposes a template-based algorithm that automatically partitions one such structure, the load-compute-store (LCS) loop. Because customizing instructions for application-specific instruction-set processors (ASIPs) interacts with task partitioning, the proposed algorithm integrates dynamic pipelining and ASIP techniques in an iterative improvement strategy: first, an initial pipelining scheme is generated to obtain the maximum parallelism; second, specific instructions are customized for each subprogram under the initial partition; third, the program is repartitioned via pipelining under the resulting instruction configurations. The method has been implemented in a commercial extensible multiprocessor design flow, using the Xtensa-based XTMP platform from Tensilica Inc. In a case study of the Fast Fourier Transform (FFT), programs partitioned by the proposed method achieve an average speedup of 10× over the original unpartitioned sequential programs running on a uniprocessor system.
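
The three-step strategy (partition, customize, repartition) can be illustrated with a toy model. The sketch below is not the paper's algorithm: it assumes the loop body is a chain of tasks with known cycle costs, that each pipeline stage runs on its own processor, and that each processor can accelerate exactly one task with a custom instruction. The function names (`partition_chain`, `customize`, `refine`) and the `base`/`accel` cost lists are hypothetical, introduced only for illustration.

```python
from itertools import accumulate


def partition_chain(costs, stages):
    """Split a chain of task costs into `stages` contiguous, non-empty groups,
    minimizing the largest group sum (the pipeline bottleneck)."""
    n = len(costs)
    prefix = [0] + list(accumulate(costs))
    inf = float("inf")
    # best[k][i]: smallest bottleneck for the first i tasks using k stages
    best = [[inf] * (n + 1) for _ in range(stages + 1)]
    cut = [[0] * (n + 1) for _ in range(stages + 1)]
    best[0][0] = 0
    for k in range(1, stages + 1):
        for i in range(1, n + 1):
            for j in range(k - 1, i):
                cand = max(best[k - 1][j], prefix[i] - prefix[j])
                if cand < best[k][i]:
                    best[k][i], cut[k][i] = cand, j
    # Walk the cut table backwards to recover the (start, end) stage bounds.
    bounds, i = [], n
    for k in range(stages, 0, -1):
        j = cut[k][i]
        bounds.append((j, i))
        i = j
    bounds.reverse()
    return bounds, best[stages][n]


def customize(base, accel, bounds):
    """Assumed budget: each stage accelerates the one task that saves the most cycles."""
    costs = list(base)
    for lo, hi in bounds:
        pick = max(range(lo, hi), key=lambda i: base[i] - accel[i])
        costs[pick] = accel[pick]
    return costs


def bottleneck_of(costs, bounds):
    return max(sum(costs[lo:hi]) for lo, hi in bounds)


def refine(base, accel, stages, max_rounds=10):
    """Iterate partition -> customize -> repartition until the bottleneck stops improving."""
    bounds, _ = partition_chain(base, stages)        # step 1: initial pipelining
    costs = customize(base, accel, bounds)           # step 2: per-stage customization
    best = (bottleneck_of(costs, bounds), bounds, costs)
    for _ in range(max_rounds):
        bounds, _ = partition_chain(costs, stages)   # step 3: repartition under new costs
        costs = customize(base, accel, bounds)       # re-customize for the new stages
        current = bottleneck_of(costs, bounds)
        if current >= best[0]:
            break
        best = (current, bounds, costs)
    return best


if __name__ == "__main__":
    base = [4, 2, 6, 3, 5]    # hypothetical cycle counts per task
    accel = [2, 2, 3, 1, 4]   # hypothetical counts with a custom instruction
    print(refine(base, accel, stages=3))
```

The bottleneck stage bounds steady-state pipeline throughput, which is why both the partitioner and the stopping rule track it. The loop stops as soon as a repartition no longer lowers the bottleneck, a simple stand-in for whatever convergence criterion the actual method uses.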
