Data link compression can efficiently compress the data stream between main memory and the processor chip in single-processor systems. By dynamically updating a value cache on each side of the link with the most frequently transmitted values, frequent value encoding can compress the data stream by up to 70%. Unfortunately, in multiprocessors the number of value caches needed grows quadratically with the number of nodes, which causes a scalability problem. This paper shows that by sharing the caches between different pairs of communicating nodes, the frequent values stored at each node can be utilized more efficiently. However, for interconnects with point-to-point links, sharing caches introduces overhead traffic to keep the value caches consistent. If all misses in the shared cache are broadcast to all other nodes, the generated traffic becomes so large that it is better to transmit the values uncompressed. We propose and evaluate three techniques that aim at reducing this overhead and find that most of this traffic can be eliminated, but at the cost of less efficient compression; the net result is comparable to using dedicated value caches.
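To make the mechanism concrete, the following is a minimal sketch of frequent value encoding over a single point-to-point link, under assumed parameters (a 16-entry, LRU-managed value cache; 32-bit values). Both ends apply the same deterministic update rule to every transmitted value, so the two caches stay consistent without extra traffic; all names here (`ValueCache`, `send`, `receive`) are illustrative, not from the paper.

```python
CACHE_SIZE = 16   # assumed value-cache capacity (16 entries -> 4-bit indices)

class ValueCache:
    """LRU-ordered table of frequent values, kept identical on both
    sides of the link by applying the same update rule to each value."""
    def __init__(self):
        self.entries = []  # most recently used first

    def lookup(self, value):
        """Return the value's index, or None on a miss."""
        try:
            return self.entries.index(value)
        except ValueError:
            return None

    def touch(self, value):
        """Move value to the front (inserting it on a miss),
        evicting the least recently used entry if the cache is full."""
        if value in self.entries:
            self.entries.remove(value)
        self.entries.insert(0, value)
        del self.entries[CACHE_SIZE:]

def send(value, cache):
    """Encode one value: a hit costs only a short index on the wire,
    a miss costs the full value (plus a tag bit)."""
    idx = cache.lookup(value)  # index in the pre-update cache state
    cache.touch(value)
    if idx is not None:
        return ("hit", idx)
    return ("miss", value)

def receive(packet, cache):
    """Decode one value, replaying the sender's cache update."""
    kind, payload = packet
    value = cache.entries[payload] if kind == "hit" else payload
    cache.touch(value)
    return value
```

A short round trip shows the caches staying in lockstep: repeated values compress to hits, while first occurrences go out uncompressed as misses.

```python
tx, rx = ValueCache(), ValueCache()
stream = [0, 0xFFFF, 0, 0, 7, 0]
packets = [send(v, tx) for v in stream]
decoded = [receive(p, rx) for p in packets]
assert decoded == stream
```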