Speech segmentation at the phone level imposes high resolution requirements on the short-time analysis of the audio signal. In this work, we employ the Bayesian information criterion corrected for small samples and model speech samples with the generalised Gamma distribution, which offers a more efficient parametric characterisation of speech in the frequency domain than the Gaussian distribution. Using a computationally inexpensive maximum likelihood approach for parameter estimation, we show that the proposed adjustments yield a significant performance improvement in noisy environments.
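
As a minimal illustration of why the generalised Gamma distribution is a more flexible model than the Gaussian, the sketch below evaluates its log-density and checks that the Gaussian is recovered as a special case. The abstract does not state a parametrisation, so the two-sided form f(x) = gamma * beta^eta / (2 * Gamma(eta)) * |x|^(eta*gamma - 1) * exp(-beta * |x|^gamma) is an assumption here, and the function names and the parameter names gamma, eta and beta are hypothetical. It is not the estimation procedure or the corrected criterion used in this work.

```python
import math


def gen_gamma_logpdf(x, gamma, eta, beta):
    """Log-density of a two-sided generalised Gamma distribution.

    Assumed parametrisation (not taken from the abstract):
        f(x) = gamma * beta**eta / (2 * Gamma(eta))
               * |x|**(eta * gamma - 1) * exp(-beta * |x|**gamma)
    """
    if gamma <= 0 or eta <= 0 or beta <= 0:
        raise ValueError("gamma, eta and beta must all be positive")
    ax = abs(x)
    exponent = eta * gamma - 1.0
    if ax == 0.0:
        # At the origin the power term is 1, 0 or unbounded,
        # depending on the sign of the exponent.
        if exponent == 0.0:
            power_term = 0.0
        elif exponent > 0.0:
            return -math.inf
        else:
            return math.inf
    else:
        power_term = exponent * math.log(ax)
    log_norm = (math.log(gamma) + eta * math.log(beta)
                - math.log(2.0) - math.lgamma(eta))
    return log_norm + power_term - beta * ax ** gamma


def gen_gamma_loglik(samples, gamma, eta, beta):
    """Log-likelihood of i.i.d. samples under the density above."""
    return sum(gen_gamma_logpdf(x, gamma, eta, beta) for x in samples)


def gaussian_logpdf(x, sigma):
    """Log-density of a zero-mean Gaussian, used only for comparison."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - x ** 2 / (2.0 * sigma ** 2)


if __name__ == "__main__":
    sigma = 1.3
    beta = 1.0 / (2.0 * sigma ** 2)
    # gamma = 2, eta = 0.5 reduces the density to a zero-mean Gaussian.
    print(gen_gamma_logpdf(0.7, 2.0, 0.5, beta))
    print(gaussian_logpdf(0.7, sigma))
```

Under this parametrisation, gamma = 2 with eta = 0.5 gives the Gaussian and gamma = 1 with eta = 1 gives the Laplacian, so both are nested inside the same family and the extra shape parameter can adapt to heavier-tailed speech data.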