Methods and apparatus for reducing accelerator memory access costs in platforms with multiple memory channels. The apparatus includes a computing platform having a plurality of accelerators and a plurality of memory devices accessed via a plurality of memory channels. A job is submitted via software running on the computing platform to access a function to be offloaded to an accelerator. Under the offloaded function, the accelerator accesses one or more buffers that collectively require access over multiple memory channels among the plurality of memory channels. The accelerators that have an available instance of the function are identified, and an aggregate cost of accessing the one or more buffers over the multiple memory channels is calculated for each of those accelerators. The accelerator with the lowest aggregate cost is then selected, and the function is offloaded to it. New instruction set architecture (ISA) instructions are also disclosed for identifying the memory pages and memory channels used for buffers.
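The selection step described above can be illustrated with a minimal sketch. The abstract does not state how the cost is computed, so the cost model here (a per-page cost for each accelerator and memory channel pair, multiplied by the number of buffer pages on that channel and summed) is an assumption, and all names (`aggregate_cost`, `select_accelerator`, `acc0`, and so on) and all numeric values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical inputs, for illustration only.
# Each entry: (memory channel, number of buffer pages reached via that channel).
BUFFER_PAGES: List[Tuple[int, int]] = [(0, 4), (1, 2)]

# Whether each accelerator currently has an available instance of the function.
AVAILABLE: Dict[str, bool] = {"acc0": True, "acc1": True, "acc2": False}

# Assumed per-page access cost from each accelerator to each memory channel.
CHANNEL_COST: Dict[Tuple[str, int], int] = {
    ("acc0", 0): 1, ("acc0", 1): 3,
    ("acc1", 0): 3, ("acc1", 1): 1,
    ("acc2", 0): 1, ("acc2", 1): 1,
}


def aggregate_cost(acc: str,
                   buffer_pages: List[Tuple[int, int]],
                   channel_cost: Dict[Tuple[str, int], int]) -> int:
    """Sum the assumed per-page cost over every channel the buffers use."""
    return sum(channel_cost[(acc, ch)] * pages for ch, pages in buffer_pages)


def select_accelerator(buffer_pages: List[Tuple[int, int]],
                       available: Dict[str, bool],
                       channel_cost: Dict[Tuple[str, int], int]) -> Optional[str]:
    """Return the available accelerator with the lowest aggregate cost."""
    # Step 1: keep only accelerators with an available function instance.
    candidates = [acc for acc, ok in available.items() if ok]
    if not candidates:
        return None
    # Step 2: compute the aggregate cost per candidate and pick the minimum.
    return min(candidates,
               key=lambda acc: aggregate_cost(acc, buffer_pages, channel_cost))


if __name__ == "__main__":
    print(select_accelerator(BUFFER_PAGES, AVAILABLE, CHANNEL_COST))
```

In this sketch `acc2` would be cheapest but is skipped because it has no available instance of the function, so the choice falls to the cheaper of the remaining two accelerators.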