A method comprising using at least one hardware processor for: receiving a three-dimensional model of an object; receiving an audio sequence embodied as a digital file that comprises a musical composition; generating a video frame sequence; and synthesizing the audio sequence and the video frame sequence into an audiovisual clip. The three-dimensional model is embodied as a digital file that comprises a representation of the object. The generating comprises computing a caricature of the object by applying a computerized caricaturization algorithm to the three-dimensional model. The computing comprises scaling gradient fields of surface coordinates of the three-dimensional model by a function of a Gaussian curvature of the surface, and finding a regular surface whose gradient fields fit the scaled gradient fields. The computing is performed with a different exaggeration factor for each of multiple ones of the video frames, and each exaggeration factor is based on one or more parameters of the musical composition of the audio sequence.
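The two computational ideas in the abstract can be illustrated with a minimal sketch, under stated assumptions. The abstract does not say which musical parameter drives the exaggeration factor, so `frame_exaggeration` assumes per-frame RMS loudness, linearly mapped to a range [`g_min`, `g_max`]. The abstract also does not give the curvature function, so `caricature_curve` assumes |curvature| raised to the exaggeration factor, and it works on a 2D closed polygon rather than a 3D surface: edge vectors stand in for the gradient fields, discrete turning-angle curvature stands in for Gaussian curvature, and a least-squares fit stands in for finding the regular surface. All function and parameter names are hypothetical, and this is an analogue of the claimed method, not its implementation.

```python
import numpy as np

def frame_exaggeration(samples, sample_rate, fps, g_min=0.0, g_max=1.0):
    """One exaggeration factor per video frame, from audio loudness (assumed parameter)."""
    samples = np.asarray(samples, dtype=float)
    hop = sample_rate // fps                      # audio samples per video frame
    n_frames = len(samples) // hop
    windows = samples[: n_frames * hop].reshape(n_frames, hop)
    rms = np.sqrt(np.mean(windows ** 2, axis=1))  # RMS loudness of each frame window
    peak = rms.max() if n_frames else 0.0
    norm = rms / peak if peak > 0 else np.zeros_like(rms)
    return g_min + (g_max - g_min) * norm

def caricature_curve(points, gamma):
    """2D analogue: scale edge vectors by |curvature|**gamma, then refit vertices."""
    p = np.asarray(points, dtype=float)
    n = len(p)
    edges = np.roll(p, -1, axis=0) - p            # e_i = p_{i+1} - p_i
    prev = np.roll(edges, 1, axis=0)              # e_{i-1}
    cross = prev[:, 0] * edges[:, 1] - prev[:, 1] * edges[:, 0]
    dot = np.sum(prev * edges, axis=1)
    turn = np.arctan2(cross, dot)                 # turning angle at vertex i
    length = 0.5 * (np.linalg.norm(prev, axis=1) + np.linalg.norm(edges, axis=1))
    kappa = np.abs(turn) / length                 # discrete curvature at vertex i
    kappa_edge = 0.5 * (kappa + np.roll(kappa, -1))   # edge i joins vertices i, i+1
    scaled = edges * (kappa_edge ** gamma)[:, None]   # scaled "gradient field"
    # Difference operator: (D q)_i = q_{i+1} - q_i, with wrap-around.
    D = np.roll(np.eye(n), 1, axis=1) - np.eye(n)
    # Least-squares fit of vertices to the scaled edges; the minimum-norm
    # solution removes the free translation by centering the result at the origin.
    q, *_ = np.linalg.lstsq(D, scaled, rcond=None)
    return q

# Usage: an ellipse caricatured with a different factor for each video frame.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
ellipse = np.column_stack([2.0 * np.cos(t), np.sin(t)])
audio = np.concatenate([np.zeros(4), np.ones(4), 0.5 * np.ones(4)])
factors = frame_exaggeration(audio, sample_rate=8, fps=2)
frames = [caricature_curve(ellipse, g) for g in factors]
```

With a factor of 0 the curvature weights are all 1, so the input shape is recovered up to translation; larger factors lengthen edges in high-curvature regions and shorten them in flat ones, which is the exaggeration effect in this simplified setting.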