This paper describes a method for the unsupervised, gender-independent estimation of the average human vocal tract length from the speech waveform. Results are reported for Fant's (1960) X-ray vowel data and for experiments on multiple sentence utterances from 86 male and 78 female TIMIT speakers, including correlation analyses between the vocal tract length estimates and the given body heights. All of the investigated error criteria that permit non-iterative, closed-form estimator solutions are found to show good speaker-clustering potential within both the male and female subgroups.
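To illustrate what a non-iterative, closed-form estimator can look like, the following is a minimal sketch and not the paper's own formulation. It assumes the textbook uniform lossless tube model (closed at the glottis, open at the lips), whose resonances are F_k = (2k - 1) c / (4 L), and a plain squared-error criterion on formant frequencies. The function names, the speed-of-sound value, and the choice of error criterion are all assumptions made for this illustration.

```python
# Illustrative sketch only: uniform-tube model with a least-squares criterion.
# The error criteria actually investigated in the paper are not given in the abstract.

SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for warm, moist air


def uniform_tube_resonances(length_cm, n_formants, c=SPEED_OF_SOUND_CM_S):
    """Resonances F_k = (2k - 1) * c / (4 * L) of a uniform lossless tube."""
    return [(2 * k - 1) * c / (4.0 * length_cm) for k in range(1, n_formants + 1)]


def estimate_vtl_least_squares(formants_hz, c=SPEED_OF_SOUND_CM_S):
    """Closed-form length estimate from measured formants F_1..F_n.

    Minimizes sum_k (F_k - (2k - 1) * s)^2 over s = c / (4 * L).
    Setting the derivative to zero gives
        s = sum_k (2k - 1) * F_k / sum_k (2k - 1)^2,
    so no iteration is needed; the length is L = c / (4 * s).
    """
    if not formants_hz:
        raise ValueError("at least one formant frequency is required")
    numerator = sum((2 * k - 1) * f for k, f in enumerate(formants_hz, start=1))
    denominator = sum((2 * k - 1) ** 2 for k in range(1, len(formants_hz) + 1))
    s = numerator / denominator
    return c / (4.0 * s)


if __name__ == "__main__":
    # Round trip: synthesize resonances for a 17.5 cm tube, then recover the length.
    formants = uniform_tube_resonances(17.5, 4)
    print(formants)
    print(estimate_vtl_least_squares(formants))
```

Under this assumed criterion the higher formants receive larger weights (2k - 1). Other criteria, for example errors measured on relative or log frequencies, would weight the formants differently while still admitting a closed-form solution.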