Ding et al. have explored a novel pitch-synchronous speech analysis-synthesis method [1] based on an auto-regressive with exogenous input (ARX) speech production model. The method automatically estimates the vocal tract (formant) and voice source parameters from a speech utterance. It has, however, shown deficiencies in the analysis of high-pitched voices, and it introduces click sounds at transitions between vocalic and weakly voiced consonantal segments. This paper proposes an improved ARX method to solve these problems. Perceptual comparison experiments show that the quality of speech re-synthesized by the proposed method is higher than that of speech re-synthesized by a well-known cepstral method.