Journal: Microprocessors and Microsystems

VThreads: A novel VLIW chip multiprocessor with hardware-assisted PThreads

Abstract

We discuss VThreads, a novel VLIW CMP with hardware-assisted shared-memory thread support. VThreads supports Instruction Level Parallelism via static multiple-issue and Thread Level Parallelism via hardware-assisted POSIX Threads, along with extensive customization. It allows the instantiation of tightly-coupled streaming accelerators and supports up to 7-address Multiple-Input, Multiple-Output instruction extensions. VThreads is designed in technology-independent Register-Transfer-Level VHDL and prototyped on 40 nm and 28 nm Field-Programmable Gate Arrays. It was evaluated against the Leon3MP CMP, a PThreads-based multiprocessor built on the Sparc-V8 ISA. On a 65 nm ASIC implementation, VThreads achieves up to a 7.2x performance increase on synthetic benchmarks, 5x on a parallel Mandelbrot implementation, 66% on a threaded JPEG implementation, 79% on an edge-detection benchmark, and approximately 13% on DES compared to the Leon3MP CMP. In the range of 2 to 8 cores, VThreads demonstrates a post-route (statistical) power reduction between 65% and 57%, at an area increase of 1.2%-10% for 1-8 cores, compared to a similarly-configured Leon3MP CMP. This combination of micro-architectural features, scalability, extensibility, hardware support for low-latency PThreads, power efficiency and area makes the processor an attractive proposition for low-power, deeply-embedded applications requiring minimum OS support. (C) 2016 Elsevier B.V. All rights reserved.
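
To make the thread-level workload concrete, the following is a minimal sketch of a PThreads-parallel Mandelbrot kernel of the general kind the abstract mentions. It is not the paper's benchmark code, and it uses only the standard software POSIX Threads API (`pthread_create`, `pthread_join`), not any VThreads-specific hardware interface. The image size, the row-interleaved work split, and the names `mandel_iters`, `worker`, `render` and `band_t` are all assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define WIDTH       64   /* assumed image width  */
#define HEIGHT      32   /* assumed image height */
#define MAX_ITER    100  /* assumed iteration cap */
#define MAX_THREADS 8    /* the abstract reports results for up to 8 cores */

/* Shared output buffer: each thread writes a disjoint set of rows. */
int image[HEIGHT][WIDTH];

/* Hypothetical per-thread work descriptor: rows first_row, first_row+step, ... */
typedef struct {
    int first_row;
    int row_step;
} band_t;

/* Escape-time iteration count for the point c = cr + i*ci. */
int mandel_iters(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int n = 0;
    while (n < max_iter && zr * zr + zi * zi <= 4.0) {
        double t = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = t;
        n++;
    }
    return n;
}

/* Thread body: compute every row_step-th row, starting at first_row. */
void *worker(void *arg)
{
    const band_t *b = arg;
    for (int y = b->first_row; y < HEIGHT; y += b->row_step) {
        for (int x = 0; x < WIDTH; x++) {
            double cr = -2.0 + 3.0 * x / WIDTH;   /* real axis: [-2, 1) */
            double ci = -1.0 + 2.0 * y / HEIGHT;  /* imag axis: [-1, 1) */
            image[y][x] = mandel_iters(cr, ci, MAX_ITER);
        }
    }
    return NULL;
}

/* Render the image with nthreads workers; returns 0 on success, -1 if a
 * thread could not be created (some rows are then left uncomputed). */
int render(int nthreads)
{
    pthread_t tid[MAX_THREADS];
    band_t band[MAX_THREADS];
    int created = 0, rc = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);

    for (int i = 0; i < nthreads; i++) {
        band[i].first_row = i;
        band[i].row_step = nthreads;
        if (pthread_create(&tid[i], NULL, worker, &band[i]) != 0) {
            rc = -1;
            break;
        }
        created++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return rc;
}
```

Calling `render(4)` fills `image` using four threads. Because the threads write disjoint rows, no locking is needed, so run time is dominated by thread creation, joining and the arithmetic itself. Interleaving rows rather than assigning contiguous bands is one common way to balance the uneven per-pixel cost of the Mandelbrot set across threads.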
