Accurate endpoint detection is important for improving speech recognition performance. This paper proposes a novel endpoint detection method that combines energy-based and likelihood ratio-based voice activity detection (VAD) criteria, where the likelihood ratio is calculated with speech/non-speech Gaussian mixture models (GMMs). Moreover, the proposed method introduces discriminative feature extraction (DFE) in order to improve speech/non-speech classification. DFE is used to train the parameters required for calculating the likelihood ratio. Our experimental evaluation showed that the proposed method reduces the recognition error rate compared with a conventional energy-based technique.
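
To make the combination of the two criteria concrete, the following is a minimal sketch of a frame-level decision that uses both a log-energy test and a GMM log-likelihood ratio test. It is not the method of the paper: the function names, the diagonal-covariance GMM representation, the thresholds, and in particular the logical AND used to combine the two criteria are all assumptions made for illustration, and DFE training of the parameters is not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, gmm):
    """Log-likelihood of x under a GMM given as [(weight, mean, var), ...]."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def log_energy(samples):
    """Log of the mean squared amplitude of one frame, floored to avoid log(0)."""
    return math.log(max(sum(s * s for s in samples) / len(samples), 1e-12))

def is_speech(samples, feature, speech_gmm, nonspeech_gmm,
              energy_thresh, llr_thresh):
    """Hypothetical combined criterion: both tests must pass (an assumption)."""
    llr = gmm_loglik(feature, speech_gmm) - gmm_loglik(feature, nonspeech_gmm)
    return log_energy(samples) > energy_thresh and llr > llr_thresh

def endpoints(decisions):
    """Return (first, last) speech frame indices, or None if no speech frame."""
    speech_frames = [i for i, d in enumerate(decisions) if d]
    return (speech_frames[0], speech_frames[-1]) if speech_frames else None
```

In this sketch, `is_speech` would be applied to each frame of an utterance and `endpoints` would then pick the first and last frames classified as speech. A practical detector would typically also smooth the per-frame decisions (for example with a hangover scheme) before locating the endpoints.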