A voice quality conversion system includes: an analysis unit which analyzes sounds of plural vowels of different types to generate first vocal tract shape information for each type of the vowels; a combination unit which combines, for each type of the vowels, the first vocal tract shape information on that type of vowel and the first vocal tract shape information on a different type of vowel to generate second vocal tract shape information on that type of vowel; and a synthesis unit which (i) combines vocal tract shape information on a vowel included in input speech and the second vocal tract shape information on the same type of vowel to convert vocal tract shape information on the input speech, and (ii) generates a synthetic sound using the converted vocal tract shape information and voicing source information on the input speech to convert the voice quality of the input speech.
展开▼