Deconvolution of the speech excitation (source) and vocal tract (filter) components through log-magnitude spectral processing is well established and has led to the well-known cepstral features used in a multitude of speech processing tasks. This paper presents a novel source-filter decomposition based on processing in the phase domain. We show that separation between source and filter in the log-magnitude spectra is far from perfect, leading to loss of vital vocal tract information. It is demonstrated that the same task can be better performed by trend and fluctuation analysis of the phase spectrum of the minimum-phase component of speech, which can be computed via the Hilbert transform. Trend and fluctuation can be separated through low-pass filtering of the phase, using the additivity of vocal tract and source in the phase domain. This results in separated signals which have a clear relation to the vocal tract and excitation components. The effectiveness of the method is put to the test in a speech recognition task. The vocal tract component extracted in this way is used as the basis of a feature extraction algorithm for speech recognition on the Aurora-2 database. The recognition results show up to 8.5% absolute improvement in comparison with MFCC features on average (0-20 dB).
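
The phase-domain decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the minimum-phase phase spectrum is obtained from the log-magnitude spectrum by cepstral folding (a standard discrete form of the Hilbert-transform relation), and that the low-pass filter is a Hann-weighted moving average along the frequency axis. The function names, the FFT size and the smoothing width are hypothetical choices made only for illustration.

```python
import numpy as np


def minimum_phase_phase(frame, n_fft=1024, eps=1e-10):
    """Phase spectrum of the minimum-phase component of one frame.

    Uses the Hilbert-transform relation between log-magnitude and phase,
    implemented by folding the real cepstrum onto non-negative quefrencies.
    n_fft is assumed to be even.
    """
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + eps)
    real_cepstrum = np.fft.ifft(log_mag).real

    # Folding window: keep quefrency 0 and n_fft/2, double the positive
    # quefrencies, zero the negative ones.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0

    # FFT of the folded cepstrum is log|X| + j * (minimum-phase phase).
    return np.fft.fft(real_cepstrum * fold).imag


def split_trend_fluctuation(phase, width=31):
    """Split the phase into a smooth trend and a residual fluctuation.

    Assumption: the trend (low-pass output) relates to the vocal tract and
    the fluctuation (residual) to the excitation. width must be odd.
    """
    half = phase[: len(phase) // 2 + 1]      # non-negative frequencies only
    kernel = np.hanning(width)
    kernel /= kernel.sum()                   # unit DC gain
    padded = np.pad(half, width // 2, mode="reflect")
    trend = np.convolve(padded, kernel, mode="valid")
    fluctuation = half - trend               # additive by construction
    return trend, fluctuation
```

Applied to a windowed speech frame, `split_trend_fluctuation(minimum_phase_phase(frame))` returns two arrays covering the non-negative frequency bins. The smoothing width controls how much fine structure is assigned to the source rather than to the filter, and would need tuning for a given sampling rate and FFT size.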