Methods, systems, and computer program products for accelerating sorting of data are provided herein. A computer-implemented method includes: retrieving a plurality of cache lines of data from an input buffer, wherein each cache line comprises a plurality of elements; scattering the plurality of elements of each retrieved cache line into a plurality of bins, wherein said scattering comprises using one or more vector instructions; forming a bin cache line in a corresponding one of the plurality of bins, wherein the bin cache line comprises a group of the plurality of elements which were scattered to the corresponding one of the plurality of bins; writing the bin cache line from the corresponding one of the plurality of bins to a memory; and loading the bin cache line from the memory to the input buffer.
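The steps recited above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the abstract does not say how a bin is chosen, so the sketch assumes a least-significant-digit radix partition; the names `scatter_pass` and `radix_sort` and the constants `CACHE_LINE_ELEMS` and `RADIX_BITS` are hypothetical; and a scalar loop stands in for the vector scatter instructions, which plain Python cannot express.

```python
# Assumption: a 64-byte cache line holding sixteen 4-byte keys.
CACHE_LINE_ELEMS = 16
# Assumption: 2 radix bits per pass, giving 4 bins.
RADIX_BITS = 2
NUM_BINS = 1 << RADIX_BITS


def scatter_pass(input_buffer, shift):
    """Run one partitioning pass over input_buffer and return the refilled buffer.

    The bin of each key is taken from RADIX_BITS bits starting at `shift`
    (an assumed binning rule; the abstract does not specify one).
    """
    mask = NUM_BINS - 1
    bins = [[] for _ in range(NUM_BINS)]    # staging area, one partial line per bin
    memory = [[] for _ in range(NUM_BINS)]  # per-bin output regions in memory

    # Retrieve the input one cache line at a time.
    for start in range(0, len(input_buffer), CACHE_LINE_ELEMS):
        line = input_buffer[start:start + CACHE_LINE_ELEMS]

        # Scatter the elements of the line into bins.
        # Scalar stand-in for the vector scatter described in the abstract.
        for key in line:
            b = (key >> shift) & mask
            bins[b].append(key)

            # A full bin cache line has formed: write it to memory as one unit.
            if len(bins[b]) == CACHE_LINE_ELEMS:
                memory[b].extend(bins[b])
                bins[b].clear()

    # Flush partially filled bin lines left over at the end of the pass.
    for b in range(NUM_BINS):
        memory[b].extend(bins[b])
        bins[b].clear()

    # Load the bin lines from memory back into the input buffer, bin by bin.
    return [key for region in memory for key in region]


def radix_sort(keys, key_bits=8):
    """Sort non-negative integers below 2**key_bits by repeating scatter_pass."""
    buf = list(keys)
    for shift in range(0, key_bits, RADIX_BITS):
        buf = scatter_pass(buf, shift)
    return buf
```

Because each bin receives its elements in arrival order and is written out in that same order, every pass is stable, which is what lets the assumed least-significant-digit scheme produce a fully sorted buffer after the last pass. Staging elements until a whole bin cache line is ready means each write to memory covers a full line rather than a single element.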