This paper explores optimization techniques of the synchronization mechanisms for MPSoCs based on complex interconnect (Network-on-Chip), targeted at future mobile -systems. We suggest the architecture of the memory controller optimized to minimize synchronization overhead. The proposed solution is based on the idea of performing synchronization operations which require the continuous polling of a shared variable, thus featuring large contention (e.g. spin locks), locally in the memory. We introduce a HW module, which augments the memory controller, the Synchronization-operation Buffer (SB), which queues and manages the requests issued by the processors. Experimental validation has been carried out by using GRAPES, a cycle-accurate performance/power simulation platform. For an 8-processor target architecture, we show that the proposed solution achieves up to 40% performance improvement and 25% energy saving with respect to synchronization based on the caching of the synchronization variables and directory-based coherency protocol. Furthermore, we prove the scalability of the proposed approach when the number of processors increases.
展开▼