Spoken language recognition runs into difficulty when it encounters an unknown word. Not only is the new word itself unrecognisable, but its presence also degrades recognition of the surrounding words. This study explores the use of a back-off statistical recogniser to allow recognition of out-of-vocabulary words in a grammar-based speech recognition system. It shows that a statistical language model, built from a corpus collected with a grammar-based system and augmented with minimally constrained, domain-appropriate material, can extract words that are outside the grammar's vocabulary from an unseen corpus with fairly high precision.