In this paper, we propose a new method for calculating the output layer in neural machine translation systems. The method is based on predicting a binary code for each word and can reduce the computation time and memory requirements of the output layer to logarithmic in the vocabulary size in the best case. In addition, we introduce two advanced approaches to improve the robustness of the proposed model: using error-correcting codes and combining softmax with binary codes. Experiments on two English-Japanese bidirectional translation tasks show that the proposed models achieve BLEU scores approaching those of the softmax, while reducing memory usage to less than 1/10 and improving decoding speed on CPUs by 5x to 10x.
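The core idea can be illustrated with a minimal sketch: instead of producing one logit per vocabulary word, the output layer predicts ceil(log2 V) independent bits, and the predicted word is recovered by thresholding those bits. The id-to-binary mapping and toy probabilities below are illustrative assumptions, not the paper's actual code assignment or training procedure.

```python
import math

def word_to_bits(word_id: int, n_bits: int) -> list:
    # Binary code of a word id, least-significant bit first.
    return [(word_id >> i) & 1 for i in range(n_bits)]

def bits_to_word(bits) -> int:
    # Inverse mapping: reassemble the word id from its bit code.
    return sum(b << i for i, b in enumerate(bits))

vocab_size = 10000
# The output layer only needs n_bits units instead of vocab_size logits.
n_bits = math.ceil(math.log2(vocab_size))  # 14 bits for a 10k vocabulary

# Toy stand-in for the model's per-bit probabilities (e.g. sigmoid outputs).
target_id = 4242
probs = [0.9 if b else 0.1 for b in word_to_bits(target_id, n_bits)]

# Decoding: threshold each bit at 0.5 and convert back to a word id.
decoded = bits_to_word([1 if p > 0.5 else 0 for p in probs])
```

With this scheme the output computation scales with the code length rather than the vocabulary size, which is what yields the logarithmic best case mentioned above; error-correcting codes add redundant bits to tolerate individual bit-prediction errors.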