The present disclosure provides a method and a device for extracting an acoustic feature based on a convolutional neural network, and a terminal device. The method includes: arranging speech to be recognized into a speech spectrogram with a predetermined number of dimensions; and recognizing the speech spectrogram by the convolutional neural network to obtain the acoustic feature of the speech to be recognized.
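The two steps of the method can be illustrated with a minimal sketch. It is not the disclosed implementation: the function names `speech_to_spectrogram` and `conv2d_relu`, the 16 kHz sample rate, the 25 ms / 10 ms framing, the choice of 40 frequency dimensions, the uniform band pooling, and the single randomly initialized convolution layer are all assumptions made for illustration only.

```python
import numpy as np

def speech_to_spectrogram(signal, sample_rate=16000, n_bins=40,
                          frame_ms=25, hop_ms=10):
    """Arrange a 1-D waveform into a (frames, n_bins) log-power spectrogram.

    n_bins plays the role of the "predetermined number of dimensions";
    all default values here are assumed, not taken from the disclosure.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop_len

    # Slice overlapping frames and apply a Hann window to each one.
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)

    # Power spectrum per frame: shape (n_frames, frame_len // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Pool FFT bins into a fixed number of uniform bands (a simple stand-in
    # for a filter bank; the bands are not mel-spaced).
    bands = np.array_split(power, n_bins, axis=1)
    pooled = np.stack([band.mean(axis=1) for band in bands], axis=1)
    return np.log(pooled + 1e-10)

def conv2d_relu(spec, kernels):
    """Valid 2-D cross-correlation with a bank of kernels, followed by ReLU.

    spec: (H, W) spectrogram; kernels: (K, kh, kw). Returns (K, H-kh+1, W-kw+1).
    """
    kh, kw = kernels.shape[1:]
    windows = np.lib.stride_tricks.sliding_window_view(spec, (kh, kw))
    out = np.einsum("ijhw,khw->kij", windows, kernels)
    return np.maximum(out, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    waveform = rng.standard_normal(16000)           # one second of placeholder audio
    spec = speech_to_spectrogram(waveform)          # time x frequency "image"
    kernels = rng.standard_normal((8, 3, 3)) * 0.1  # untrained, illustrative filters
    features = conv2d_relu(spec, kernels)
    print(spec.shape, features.shape)
```

In this sketch the spectrogram is treated as a two-dimensional image (time by frequency), so the convolution kernels slide over both axes. A trained network would stack several such layers with learned weights, and the outputs of those layers would serve as the acoustic feature.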