Vowel classification models trained on production data typically correlate more strongly with human listeners' perception when the acoustic properties of the production data are normalised prior to training and testing. Vowel normalisation procedures seek to remove inter-speaker variance due to factors such as vocal tract size, which human listeners discount when identifying vowels. Extrinsic normalisation makes use of information from a representative sample of a speaker's vowel inventory.
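As an illustration of what an extrinsic procedure can look like, the sketch below applies a z-score normalisation in the style of Lobanov: each formant value is expressed relative to the mean and standard deviation of that formant across the speaker's own sampled vowels. This is one well-known extrinsic method chosen for illustration, not necessarily a procedure evaluated in this work. The function name, the tuple layout, the use of the population standard deviation, and the formant values are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def lobanov_normalise(tokens):
    """Z-score F1 and F2 per speaker (hypothetical helper, Lobanov-style).

    tokens: list of (speaker, vowel, f1_hz, f2_hz) tuples.
    Returns a list of (speaker, vowel, z_f1, z_f2) in the same order.
    Assumes each speaker has at least two tokens with differing formant
    values; otherwise the standard deviation is zero and division fails.
    """
    # Collect every formant measurement for each speaker.
    by_speaker = defaultdict(list)
    for speaker, _vowel, f1, f2 in tokens:
        by_speaker[speaker].append((f1, f2))

    # Per-speaker mean and population standard deviation of each formant.
    stats = {}
    for speaker, formants in by_speaker.items():
        f1s = [f[0] for f in formants]
        f2s = [f[1] for f in formants]
        stats[speaker] = (mean(f1s), pstdev(f1s), mean(f2s), pstdev(f2s))

    # Express each token relative to its own speaker's statistics.
    normalised = []
    for speaker, vowel, f1, f2 in tokens:
        m1, s1, m2, s2 = stats[speaker]
        normalised.append((speaker, vowel, (f1 - m1) / s1, (f2 - m2) / s2))
    return normalised


# Made-up formant values (Hz): speaker B is speaker A scaled by 1.2,
# a crude stand-in for a shorter vocal tract.
tokens = [
    ("A", "i", 300.0, 2300.0),
    ("A", "a", 750.0, 1300.0),
    ("A", "u", 350.0, 900.0),
    ("B", "i", 360.0, 2760.0),
    ("B", "a", 900.0, 1560.0),
    ("B", "u", 420.0, 1080.0),
]
normalised = lobanov_normalise(tokens)
```

Because speaker B's values are a uniform scaling of speaker A's, the two speakers' normalised values for each vowel coincide up to floating-point error, which is the kind of inter-speaker variance such a procedure is meant to remove. The method is extrinsic because the statistics depend on the whole sample of a speaker's vowels, so an unrepresentative sample shifts every normalised value.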