A Text-to-Speech Synthesis Method and System, a Method of Training a Text-to-Speech Synthesis System, and a Method of Calculating an Expressivity Score

Abstract

A text-to-speech synthesis method comprising: receiving text; inputting the received text in a prediction network; and generating speech data, wherein the prediction network comprises a neural network, and wherein the neural network is trained by: receiving a first training dataset comprising audio data and corresponding text data; acquiring an expressivity score for each audio sample of the audio data, wherein the expressivity score is a quantitative representation of how well an audio sample conveys emotional information and sounds natural, realistic and human-like; training the neural network using a first sub-dataset, and further training the neural network using a second sub-dataset, wherein the first sub-dataset and the second sub-dataset comprise audio samples and corresponding text from the first training dataset and wherein the average expressivity score of the audio data in the second sub-dataset is higher than the average expressivity score of the audio data in the first sub-dataset.
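The two-stage training described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the sub-datasets are selected or how the expressivity score is computed, so the following are assumptions made purely for illustration: scores are precomputed floats, the first sub-dataset is the whole training set, the second keeps only samples at or above a threshold, and `Sample`, `split_by_expressivity`, `train_in_stages` and the `train_fn` callback are hypothetical names, not part of the patent.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Sequence, Tuple, TypeVar

Model = TypeVar("Model")


@dataclass(frozen=True)
class Sample:
    """One training pair plus its score (all field names are hypothetical)."""
    audio_id: str        # stand-in for the audio sample itself
    text: str            # corresponding text
    expressivity: float  # assumed precomputed; higher means more expressive


def split_by_expressivity(
    samples: Sequence[Sample], threshold: float
) -> Tuple[List[Sample], List[Sample]]:
    """Build two sub-datasets from one training set.

    Assumption: the first sub-dataset is the full set and the second keeps
    only samples scoring at or above `threshold`. Raises ValueError unless
    the second sub-dataset has a strictly higher average score.
    """
    first = list(samples)
    second = [s for s in first if s.expressivity >= threshold]
    if not second:
        raise ValueError("no samples reach the threshold")
    avg_first = mean(s.expressivity for s in first)
    avg_second = mean(s.expressivity for s in second)
    if not avg_second > avg_first:
        raise ValueError("second sub-dataset is not more expressive on average")
    return first, second


def train_in_stages(
    model: Model,
    samples: Sequence[Sample],
    threshold: float,
    train_fn: Callable[[Model, List[Sample]], Model],
) -> Model:
    """Train on the first sub-dataset, then further train on the second.

    `train_fn` is a caller-supplied placeholder for the real training loop.
    """
    first, second = split_by_expressivity(samples, threshold)
    model = train_fn(model, first)   # initial training stage
    model = train_fn(model, second)  # further training on expressive data
    return model
```

Because every excluded sample scores below every retained one, a non-empty proper subset chosen this way always has a higher average than the full set; the explicit check only rejects the degenerate cases where the threshold excludes nothing or everything.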

Bibliographic data

Publication number: WO2021123792A1

Patent type:
Publication date: 2021-06-24

Original format: PDF
Applicant/assignee: SONANTIC LIMITED

Application number: WO2020GB53266
Inventors: FLYNN JOHN; QURESHI ZEENAT

Filing date: 2020-12-17
Classification: G10L13/02; G10L13/047
Country: GB
Date added to database: 2022-08-24 19:36:55
