This paper presents several methods that use a weighted mixture of word-based and class-based language models to perform language model adaptation. A general language model is first built from the whole training corpus; then clusters are created according to a word co-occurrence measure, and word-based as well as class-based models are built from each cluster. The general language model is then combined with one or several cluster models chosen according to a minimum-perplexity criterion. Results show absolute reductions in word error rate of 1.40 and 0.49 on average for two different test sets of the "Corpus of Spontaneous Japanese."
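The combination step described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes unigram models represented as probability dictionaries, a single interpolation weight `lam`, and selection of one cluster model by minimum perplexity on an adaptation text; the function names are hypothetical.

```python
import math

def interpolate(p_general, p_cluster, lam):
    """Weighted mixture: P(w) = lam * P_general(w) + (1 - lam) * P_cluster(w)."""
    vocab = set(p_general) | set(p_cluster)
    return {w: lam * p_general.get(w, 0.0) + (1 - lam) * p_cluster.get(w, 0.0)
            for w in vocab}

def perplexity(model, text, floor=1e-10):
    """Perplexity of a word sequence under a unigram model (floor for unseen words)."""
    log_sum = sum(math.log(model.get(w, floor)) for w in text)
    return math.exp(-log_sum / len(text))

def select_cluster_model(p_general, cluster_models, adaptation_text, lam=0.7):
    """Minimum-perplexity criterion: pick the cluster LM whose mixture with
    the general LM yields the lowest perplexity on the adaptation text."""
    best = min(cluster_models,
               key=lambda i: perplexity(
                   interpolate(p_general, cluster_models[i], lam),
                   adaptation_text))
    return best, interpolate(p_general, cluster_models[best], lam)
```

In practice the interpolation weight itself would also be optimized on held-out data, and the paper additionally mixes in class-based models; the sketch only shows the selection principle.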