PROBLEM TO BE SOLVED: To provide a cross-lingual speech synthesis technique that can synthesize speech in the synthesis target language in the voice of a target speaker, even when the only available speech data in that language comes from a single speaker who is not the target speaker.;SOLUTION: The device comprises: a time information adjustment part 101 that generates time-adjusted target speaker speech data n and time-adjusted learning target language speech data n from target speaker speech data and learning target language speech data n; a voice quality converter learning part 103 that learns a speaker-independent voice quality converter from the set of pairs of time-adjusted target speaker speech data n and time-adjusted learning target language speech data n; a voice quality conversion part 111 that uses the speaker-independent voice quality converter to generate, from the synthesis target language speech data, voice-quality-converted synthesis target language speech data having the target speaker's voice quality; and a synthesis model learning part 113 that learns a cross-lingual speech synthesis model from the set of the voice-quality-converted synthesis target language speech data and the synthesis target language utterance information.;SELECTED DRAWING: Figure 2;COPYRIGHT: (C)2018,JPO&INPIT
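The data flow through parts 101, 103 and 111 can be illustrated with a minimal sketch. The abstract does not say how time information is adjusted or what form the converter takes, so everything below is an assumption made for illustration: dynamic time warping (DTW) stands in for the time adjustment, an affine least-squares mapping stands in for the learned converter, and all function names (`dtw_align`, `time_adjust`, `learn_converter`, `convert`) are hypothetical. Part 113 (synthesis model learning) is omitted because the abstract gives no detail about the model.

```python
import numpy as np


def dtw_align(x, y):
    """Align two feature sequences x (Tx, D) and y (Ty, D) with plain DTW.

    Assumption: DTW is used here only as a stand-in for the unspecified
    time information adjustment. Returns a list of (i, j) frame index pairs.
    """
    tx, ty = len(x), len(y)
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    cost = np.full((tx + 1, ty + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, tx + 1):
        for j in range(1, ty + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )
    # Backtrack from the end of both sequences to the start.
    i, j = tx, ty
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path


def time_adjust(target_feats, learning_feats):
    """Sketch of part 101: return the two sequences warped to equal length."""
    ti, li = zip(*dtw_align(target_feats, learning_feats))
    return target_feats[list(ti)], learning_feats[list(li)]


def learn_converter(aligned_learning_sets, aligned_target_sets):
    """Sketch of part 103: fit one affine map over all n aligned pairs.

    Assumption: a least-squares affine map replaces whatever converter
    the patent actually learns. Pooling every speaker's frames into one
    fit is what makes this toy converter speaker-independent.
    """
    X = np.vstack(aligned_learning_sets)
    Y = np.vstack(aligned_target_sets)
    X1 = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def convert(W, feats):
    """Sketch of part 111: apply the learned map to new speech features."""
    return np.hstack([feats, np.ones((len(feats), 1))]) @ W
```

In this sketch, the output of `convert` applied to the synthesis-language features would be paired with that language's utterance information to train the synthesis model (part 113).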