A processing apparatus supports a narrowing-and-rounding arithmetic operation which generates, in response to two operands each comprising at least one W-bit data element, a result value comprising at least one X-bit result data element. Each X-bit result data element represents a sum or difference of the corresponding W-bit data elements of the two operands, rounded to an X-bit value (where W > X). The arithmetic operation is implemented using a number of N-bit additions (N < W), with carry values from a first stage of N-bit additions being added at a second stage of N-bit additions, which also adds a rounding value to the result of the first-stage additions. This technique reduces the time required to perform the narrowing-and-rounding arithmetic operation.
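The two-stage arrangement described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes W = 32, X = 16 and N = 16, covers only the sum case, takes the rounding value to be half of the least significant retained bit, and assumes the result is the upper X bits of the rounded W-bit sum. The function names and the exact way the carries are combined in the second stage are hypothetical choices made for illustration.

```python
W, X, N = 32, 16, 16          # assumed widths for this sketch (W > X, N < W)
MASK_N = (1 << N) - 1
ROUND = 1 << (W - X - 1)      # assumed rounding value: half of the lowest kept bit


def narrowing_rounding_add_reference(a, b):
    """Single wide addition: add, round, keep the top X bits (modulo 2**W)."""
    total = (a + b + ROUND) & ((1 << W) - 1)
    return total >> (W - X)


def narrowing_rounding_add_staged(a, b):
    """Same result built from N-bit additions in two stages (illustrative)."""
    a_lo, a_hi = a & MASK_N, (a >> N) & MASK_N
    b_lo, b_hi = b & MASK_N, (b >> N) & MASK_N

    # Stage 1: independent N-bit additions of the low and high halves.
    sum_lo = a_lo + b_lo
    carry_1 = sum_lo >> N                 # carry out of the low half
    sum_lo &= MASK_N
    sum_hi = (a_hi + b_hi) & MASK_N

    # Stage 2: add the rounding value to the low half, then add both
    # carries into the high half, which becomes the X-bit result.
    carry_2 = (sum_lo + ROUND) >> N       # carry produced by rounding
    return (sum_hi + carry_1 + carry_2) & MASK_N


if __name__ == "__main__":
    print(hex(narrowing_rounding_add_staged(0x00007FFF, 0x00000001)))
```

In this sketch the high-half addition of stage 1 does not wait for the carry out of the low half, which is where a hardware implementation could save time compared with one W-bit carry chain followed by a separate rounding addition.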