GENERATING AND USING TEXT-TO-SPEECH DATA FOR SPEECH RECOGNITION MODELS

Abstract

Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword-independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword-dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.
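
As a rough illustration of what normalizing keyword-dependent confidence scores could look like, the following sketch computes per-keyword score statistics from TTS-synthesized utterances and rescales raw scores so that one threshold can serve every keyword. This is an assumption made for illustration only: the abstract does not specify the normalization method, and the function names, the z-score approach, and the sample scores are all hypothetical.

```python
from statistics import mean, pstdev


def fit_keyword_stats(tts_scores):
    """Return {keyword: (mean, std)} of raw confidence scores.

    tts_scores maps each keyword to the raw scores a recognizer produced
    on TTS-synthesized utterances of that keyword (hypothetical input).
    """
    stats = {}
    for keyword, scores in tts_scores.items():
        mu = mean(scores)
        sigma = pstdev(scores)
        # Guard against a zero spread so normalization never divides by zero.
        stats[keyword] = (mu, sigma if sigma > 0 else 1.0)
    return stats


def normalize_score(keyword, raw_score, stats):
    """Z-score a raw confidence against that keyword's TTS statistics."""
    mu, sigma = stats[keyword]
    return (raw_score - mu) / sigma


# Made-up scores: "xylophone" systematically scores lower than "hello".
tts_scores = {
    "hello": [0.90, 0.80, 1.00],
    "xylophone": [0.40, 0.50, 0.60],
}
stats = fit_keyword_stats(tts_scores)
```

With these made-up numbers, a raw score of 0.60 for "xylophone" normalizes to a positive value while a raw score of 0.80 for "hello" normalizes to a negative one, which is the kind of per-keyword bias a shared raw threshold would miss.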

Bibliographic data

Publication number: US2021304769A1

Patent type:
Publication date: 2021-09-30

Original format: PDF
Applicant/assignee: MICROSOFT TECHNOLOGY LICENSING LLC

Application number: US202015931788
Inventors: GUOLI YE; YAN HUANG; WENNING WEI; LEI HE; EVA SHARMA; JIAN WU; YAO TIAN; EDWARD C. LIN; YIFAN GONG; RUI ZHAO; JINYU LI; WILLIAM MAXWELL GALE

Filing date: 2020-05-14
Classification: G10L15/26; G10L15/16; G10L15/06; G10L13/08
Country: US
Database entry time: 2022-08-24 21:21:49
