In this paper we investigate on-line, zero-crossing-based segmentation of audio streams and classification of the resulting segments into speech and non-speech. The non-speech classes we consider are applause, auditorium noise, and silence. We demonstrate that features extracted from zero-crossings are stable and valid for discriminating speech from other signals, and that they do not require a large amount of training data. We describe an optimal segmentation of audio signals of unlimited length that uses the results of frame classification. We demonstrate that this optimal segmentation performs better than the traditional sliding-window technique.
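As an illustration of the kind of feature involved, the following is a minimal sketch of a per-frame zero-crossing rate. The abstract does not specify the exact feature set, so this is a generic formulation and not the paper's method. The function names, the frame length `frame_len`, and the hop size `hop` are assumptions chosen for the example.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len=400, hop=200):
    """Zero-crossing rate of each full frame; a trailing partial frame is dropped.

    frame_len and hop are illustrative values (25 ms and 12.5 ms at 16 kHz),
    not parameters taken from the paper.
    """
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Synthetic check: a high-frequency tone crosses zero far more often
# than a low-frequency tone sampled at the same rate.
sample_rate = 16000
low_tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(1600)]
high_tone = [math.sin(2 * math.pi * 2000 * n / sample_rate) for n in range(1600)]

low_zcr = frame_zcr(low_tone)
high_zcr = frame_zcr(high_tone)
```

Per-frame values of this kind can be computed in a single pass over the samples, which is what makes them usable on-line. How the values are turned into class decisions and segment boundaries is left to the paper itself.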