A method of automatically generating a digital soundtrack intended for synchronised playback with associated speech audio, the method being executed by one or more processing devices having associated memory. The method comprises syntactically and/or semantically analysing text representing or corresponding to the speech audio at a text segment level to generate an emotional profile for each text segment in the context of a continuous emotion model. The method further comprises generating a soundtrack for the speech audio comprising one or more audio regions that are configured or selected for playback during corresponding speech regions of the speech audio, wherein the audio configured for playback in each audio region is based on, or is a function of, the emotional profile of one or more of the text segments within the respective speech region.
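The pipeline described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes a two-dimensional valence/arousal plane as the continuous emotion model, a toy word lexicon in place of real syntactic and semantic analysis, and text segment indices as stand-ins for speech region timing. The names `LEXICON`, `AudioRegion`, `emotional_profile`, `mood_label` and `build_soundtrack` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: word -> (valence, arousal), each in [-1, 1].
# A real system would use proper syntactic/semantic analysis instead.
LEXICON = {
    "happy": (0.8, 0.5),
    "laughed": (0.7, 0.6),
    "calm": (0.5, -0.6),
    "afraid": (-0.7, 0.7),
    "screamed": (-0.6, 0.9),
    "lonely": (-0.6, -0.5),
}


@dataclass
class AudioRegion:
    start: int  # index of the first text segment covered
    end: int    # index of the last text segment covered (inclusive)
    mood: str   # label used to select or configure the audio


def emotional_profile(segment):
    """Return the mean (valence, arousal) of the lexicon words in a segment."""
    words = [w.strip(".,!?;:").lower() for w in segment.split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        return (0.0, 0.0)  # no emotional cue found: treat as neutral
    n = len(hits)
    return (sum(v for v, _ in hits) / n, sum(a for _, a in hits) / n)


def mood_label(profile):
    """Map a point in the valence/arousal plane to a coarse mood label."""
    valence, arousal = profile
    if valence == 0.0 and arousal == 0.0:
        return "neutral"
    if valence >= 0:
        return "joyful" if arousal >= 0 else "serene"
    return "tense" if arousal >= 0 else "melancholy"


def build_soundtrack(segments):
    """Merge consecutive segments that share a mood into one audio region."""
    regions = []
    for i, seg in enumerate(segments):
        mood = mood_label(emotional_profile(seg))
        if regions and regions[-1].mood == mood:
            regions[-1].end = i  # extend the current region
        else:
            regions.append(AudioRegion(i, i, mood))
    return regions


segments = [
    "She laughed, happy at last.",
    "They were happy.",
    "Then he screamed, afraid.",
    "The house was lonely.",
]
soundtrack = build_soundtrack(segments)
```

In this sketch the first two segments fall in the same quadrant of the plane and are merged into a single audio region, while the last two each start a new region. Mapping segment indices to playback times in the speech audio is left out.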