A processing unit performs multiply-and-accumulate (MAC) operations on asymmetrically quantized data. The processing unit includes a MAC hardware unit to perform the MAC operations on a first data sequence and a second data sequence to generate an asymmetric MAC output. Both the first data sequence and the second data sequence are asymmetrically quantized. The processing unit further includes an accumulator hardware unit to accumulate the first data sequence concurrently with the MAC operations to generate an accumulated output. The processing unit further includes a multiply-and-add (MAD) hardware unit to multiply the accumulated output with a second offset to generate a multiplication output, and to add the multiplication output, the asymmetric MAC output and a pre-computed value calculated before runtime to generate a final output. The second offset indicates an amount of asymmetry of the second data sequence with respect to zero.
展开▼