This research was supported in part by the China National Nature Science Foundation (No. 91120303, No. 61273267, No. 90820011 and No. 90820303). The decision function of a support vector machine (SVM) operating on likelihood ratios (LRs) has been used successfully for statistical model-based voice activity detection (VAD). It applies an optimised nonlinear decision between the two classes, speech and non-speech, instead of comparing the geometric mean of the LRs of the individual frequency bands with a given threshold. However, this approach does not take the inter-frame correlation of voice activity into account. In this paper, we explore a hybrid SVM/hidden Markov model (HMM) approach to VAD that retains the discriminative and nonlinear properties of the SVM while modeling the inter-frame correlation through a first-order HMM. Experimental results show that the proposed VAD significantly outperforms the SVM-based VAD.
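To illustrate how a first-order HMM can exploit inter-frame correlation on top of frame-wise SVM outputs, the following minimal sketch filters a sequence of SVM decision values with a two-state (non-speech/speech) forward recursion. It is not the formulation used in the paper, which the abstract does not specify. The function names `sigmoid` and `hmm_smooth`, the sigmoid mapping from SVM scores to pseudo-likelihoods, and the values `p_stay=0.9` and `prior_speech=0.5` are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hmm_smooth(svm_scores, p_stay=0.9, prior_speech=0.5):
    """Forward-filter frame-wise SVM scores with a 2-state first-order HMM.

    States: 0 = non-speech, 1 = speech.
    Assumption: sigmoid(score) is used as a stand-in for the speech
    emission likelihood (and 1 - sigmoid(score) for non-speech).
    Returns P(speech at frame t | scores up to frame t) for each frame.
    """
    # a[i][j] = P(state_t = j | state_{t-1} = i); hand-set, not trained.
    a = [[p_stay, 1.0 - p_stay],
         [1.0 - p_stay, p_stay]]
    belief = [1.0 - prior_speech, prior_speech]
    posteriors = []
    for t, score in enumerate(svm_scores):
        p = sigmoid(score)
        emit = [1.0 - p, p]
        if t == 0:
            # First frame: the prior is the prediction.
            pred = belief
        else:
            # Prediction step: propagate the previous belief through the transitions.
            pred = [belief[0] * a[0][j] + belief[1] * a[1][j] for j in range(2)]
        # Update step: weight by the emission term and renormalise.
        unnorm = [pred[j] * emit[j] for j in range(2)]
        z = unnorm[0] + unnorm[1]
        belief = [u / z for u in unnorm]
        posteriors.append(belief[1])
    return posteriors


if __name__ == "__main__":
    # A weakly negative score inside a run of confident speech frames.
    scores = [2.0, 2.0, -0.5, 2.0, 2.0]
    frame_wise = [s > 0 for s in scores]
    smoothed = [p > 0.5 for p in hmm_smooth(scores)]
    print(frame_wise)
    print(smoothed)
```

In this toy sequence, thresholding each SVM score independently labels the third frame as non-speech, whereas the HMM-filtered posterior keeps it as speech because the neighbouring frames are confidently speech. In practice the transition probabilities and the score-to-likelihood mapping would be estimated from data rather than fixed by hand.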