PROBLEM TO BE SOLVED: To improve the quality of a synthetic voice.
SOLUTION: A voice data storage part 11 stores voice data including the F0 (fundamental frequency) and the spectrum of a voice signal. An utterance information storage part 12 stores utterance information representing the time relation of each phoneme in the voice data. An F0 quantization part 13 generates arranged voice data by sorting the voice data on the basis of the F0, clusters the arranged voice data while treating the value of the F0 as if it were time, calculates a quantization threshold from the value of the F0 at each boundary between clusters, and generates quantized F0 information by quantizing the F0 on the basis of the quantization thresholds. A model learning part 16 uses the voice data, the utterance information, and the quantized F0 information to learn a voice synthesis model.
SELECTED DRAWING: Figure 5
COPYRIGHT: (C)2016,JPO&INPIT
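The processing attributed to the F0 quantization part 13 (sort by F0, cluster, take thresholds at cluster boundaries, quantize) can be illustrated with a minimal sketch. The abstract does not name the clustering method, so a plain one-dimensional k-means is swapped in here as an assumption. The function names, the number of clusters `k`, the midpoint rule for placing a threshold between two neighboring clusters, and the sample F0 values are all hypothetical, and only voiced frames (F0 > 0) are assumed to be passed in.

```python
from bisect import bisect_right


def cluster_sorted(sorted_values, k, iterations=50):
    """Assign each value of an ascending list to one of k clusters.

    Assumption: 1-D k-means (Lloyd's algorithm); the abstract does not
    say which clustering method is actually used.
    """
    n = len(sorted_values)
    # Start from centroids spread evenly over the sorted data.
    centroids = [sorted_values[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c]))
            for v in sorted_values
        ]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(sorted_values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


def quantize_f0(f0_values, k):
    """Return (thresholds, levels) for a list of voiced-frame F0 values."""
    arranged = sorted(f0_values)  # the "arranged voice data", ordered by F0
    labels = cluster_sorted(arranged, k)
    # A threshold sits wherever the cluster label changes along the sorted
    # F0 axis. Assumption: use the midpoint of the two neighboring values.
    thresholds = [
        (arranged[i - 1] + arranged[i]) / 2
        for i in range(1, len(arranged))
        if labels[i] != labels[i - 1]
    ]
    # Quantized F0 = number of thresholds at or below the value.
    levels = [bisect_right(thresholds, v) for v in f0_values]
    return thresholds, levels


# Hypothetical F0 values in Hz, one per voiced frame.
f0 = [110, 250, 118, 180, 112, 245, 185, 115, 255, 178]
thresholds, levels = quantize_f0(f0, k=3)
print(thresholds)
print(levels)
```

In this sketch the quantized levels are small integers in the original frame order, so they could be aligned with the per-phoneme utterance information and supplied to model training as an additional feature. How the model learning part 16 actually consumes the quantized F0 information is not specified in the abstract.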