With recent developments in controlling the prosodic output of speech synthesizers [1], the quality of synthetic speech has improved considerably. However, determining the prosody required to convey specific linguistic concepts is still a largely unsolved problem. Concept-to-speech systems seem the most promising approach, because they make additional information (structuring, focussing, affirmation/negation, quotation, enumeration, time/date, salutation, speaker attitude, etc.) available to the prosody generation algorithm. This paper describes a method for determining which linguistic concepts are present in the prosody of a spoken utterance and should therefore be taken into account when modelling prosody.