COMPUTE ENGINE ARCHITECTURE TO SUPPORT DATA-PARALLEL LOOPS WITH REDUCTION OPERATIONS
Abstract
Techniques involving a compute engine architecture to support data-parallel loops with reduction operations are described. In some embodiments, a hardware processor includes a memory unit and a plurality of processing elements (PEs). Each of the PEs is directly coupled via one or more neighbor-to-neighbor links with one or more neighboring PEs so that each PE can receive a value from a neighboring PE, provide a value to a neighboring PE, or both receive a value from one neighboring PE and also provide a value to another neighboring PE. The hardware processor also includes a control engine coupled with the plurality of PEs that is to cause the plurality of PEs to collectively perform a task to generate one or more output values by each performing one or more iterations of a same subtask of the task.
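As an illustration only, the arrangement described in the abstract can be sketched in software. The sketch below assumes a one-directional chain of PEs, a dot product as the task, and a multiply-accumulate over a local slice as the shared subtask; none of these choices come from the source. The names `PE`, `run_subtask`, `send`, and `control_engine` are hypothetical, and the loop visits the PEs one after another rather than modeling true parallel hardware.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class PE:
    """Hypothetical processing element with a link to at most one downstream neighbor."""
    index: int
    next_pe: Optional["PE"] = None  # assumed one-directional neighbor-to-neighbor link
    inbox: float = 0.0              # value received from the upstream neighbor

    def run_subtask(self, xs: Sequence[float], ys: Sequence[float]) -> float:
        # Same subtask on every PE: iterate over the local slice and
        # fold the products into the value received from the neighbor.
        acc = self.inbox
        for x, y in zip(xs, ys):
            acc += x * y
        return acc

    def send(self, value: float) -> None:
        # Provide a value to the downstream neighbor, if there is one.
        if self.next_pe is not None:
            self.next_pe.inbox = value


def control_engine(xs: Sequence[float], ys: Sequence[float], num_pes: int) -> float:
    """Hypothetical control engine: wires the PEs into a chain, assigns each
    a slice of the loop iterations, and returns the last PE's result."""
    pes: List[PE] = [PE(i) for i in range(num_pes)]
    for left, right in zip(pes, pes[1:]):
        left.next_pe = right

    chunk = (len(xs) + num_pes - 1) // num_pes  # ceiling division
    result = 0.0
    for pe in pes:
        lo, hi = pe.index * chunk, (pe.index + 1) * chunk
        result = pe.run_subtask(xs[lo:hi], ys[lo:hi])
        pe.send(result)  # the partial reduction travels over the link
    return result


if __name__ == "__main__":
    print(control_engine([1, 2, 3, 4], [5, 6, 7, 8], num_pes=2))
```

In this sketch the reduction value moves only between adjacent PEs, so no PE reads or writes a shared accumulator; that is one possible reading of why neighbor-to-neighbor links suit loops with reduction operations.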