This paper discusses two novel acoustic features for speech recognition from a new perspective. Most conventional acoustic features (e.g., the cepstrum) compare a pair of spectra in terms of "vertical" differences at the same frequency points. In contrast, LSP (Line Spectrum Pair) frequencies provide a means of comparing spectra in terms of "horizontal" differences along the frequency axis, reflecting formant frequency mismatches. After discussing these existing categories of acoustic feature parameters, we propose another category that represents spectrum intensities on an adaptively stretching frequency axis. We propose two novel features in the new category: one is the logarithmic difference of adjacent LSP frequencies; the other is the CSM (Composite Sinusoidal Modeling) intensity pairs. Their theoretical properties are discussed. In continuous speech recognition experiments based on triphone HMMs, comparing LSP frequencies, MFCCs, and the two new features, the new features performed better than LSP frequencies but not better than MFCCs.
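As a rough illustration of the first proposed feature, the logarithmic difference of adjacent LSP frequencies might be computed as sketched below. The abstract does not give the exact definition, so this sketch assumes one plausible reading: the natural logarithm of the gap between each pair of neighbouring LSP frequencies, with the frequencies given in radians and strictly increasing. The function name `log_adjacent_differences` and the example values are hypothetical and are not taken from the paper.

```python
import math


def log_adjacent_differences(lsp_freqs):
    """Return log(w[i+1] - w[i]) for each adjacent pair of LSP frequencies.

    Assumption (not confirmed by the abstract): the feature is the natural
    log of the gap between neighbouring LSP frequencies, which are expected
    to be strictly increasing and expressed in radians.
    """
    if len(lsp_freqs) < 2:
        raise ValueError("need at least two LSP frequencies")
    features = []
    for lower, upper in zip(lsp_freqs, lsp_freqs[1:]):
        gap = upper - lower
        if gap <= 0:
            raise ValueError("LSP frequencies must be strictly increasing")
        features.append(math.log(gap))
    return features


# Hypothetical LSP frequencies, invented purely for illustration.
example_lsp = [0.3, 0.5, 1.2, 2.0]
features = log_adjacent_differences(example_lsp)
```

Under this reading, closely spaced LSP frequencies, which typically accompany spectral peaks, yield strongly negative values, while widely spaced ones yield larger values. An order-N LSP vector produces N-1 such features.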