We propose a novel approach for estimating the reverberation model of the robust recognizer described in [1], which is designed to enable distant-talking automatic speech recognition (ASR) in reverberant environments. Based on a few calibration utterances with known transcriptions recorded in the target environment, a maximum likelihood estimator is used to find the means and variances of the reverberation model. In contrast to [1] and to HMM training on artificially reverberated training data (e.g., [2]), measurements of room impulse responses become unnecessary, and the training effort is greatly reduced. Simulations of a connected digit recognition task show that, in highly reverberant environments, the reverberation models estimated by the proposed approach achieve significantly higher recognition rates than HMMs trained on reverberant data.