A novel framework for robust speech understanding is presented. It is based on a detection and verification strategy: rather than decoding the whole utterance, it extracts the semantically significant parts and rejects the irrelevant ones. The strategy has two key features. First, a discriminative verifier is integrated to suppress false alarms; it uses anti-subword models trained specifically to verify the recognition results. Second, a key-phrase network is used as the detection unit. It embeds stochastic constraints on keyword and key-phrase connections to improve coverage and detection rates. The automatic generation of the key-phrase network structure is also addressed. This top-down, variable-length language model can be trained with a small corpus and ported to different tasks. This property, coupled with the vocabulary-independent detector and verifier, enhances the portability of the framework.
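
The verification step can be illustrated with a minimal sketch. The abstract does not state how the anti-subword scores are combined, so the sketch assumes a common formulation: a duration-normalized log-likelihood ratio between each subword model and its anti-subword model, averaged over the detected phrase and compared with a threshold. The names `SubwordScore`, `subword_llr`, and `verify_phrase`, the threshold value, and the input scores are all hypothetical and serve illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubwordScore:
    """Scores for one subword segment of a detected key-phrase (hypothetical structure)."""
    label: str
    log_lik_target: float  # log-likelihood under the subword model
    log_lik_anti: float    # log-likelihood under the matching anti-subword model
    n_frames: int          # segment duration in frames


def subword_llr(score: SubwordScore) -> float:
    """Duration-normalized log-likelihood ratio for one subword segment."""
    if score.n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return (score.log_lik_target - score.log_lik_anti) / score.n_frames


def verify_phrase(scores: List[SubwordScore], threshold: float = 0.0) -> bool:
    """Accept the phrase if the mean subword ratio reaches the threshold.

    Assumption: a plain average over subwords; other combination rules are possible.
    An empty hypothesis is rejected.
    """
    if not scores:
        return False
    mean_llr = sum(subword_llr(s) for s in scores) / len(scores)
    return mean_llr >= threshold


# Made-up scores: the first hypothesis fits the subword models better than the
# anti-subword models, the second one fits them worse.
likely_hit = [SubwordScore("k", -50.0, -60.0, 10), SubwordScore("a", -40.0, -45.0, 5)]
likely_false_alarm = [SubwordScore("k", -60.0, -50.0, 10)]

accepted = verify_phrase(likely_hit)
rejected = verify_phrase(likely_false_alarm)
```

Raising the threshold in such a scheme trades detection rate for fewer false alarms, which is the role the abstract assigns to the verifier.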