DATA SHREDDING FOR SPEECH RECOGNITION LANGUAGE MODEL TRAINING UNDER DATA RETENTION RESTRICTIONS

Abstract

Training speech recognizers (for example, their language or acoustic models) on actual user data is useful, but regulations in certain environments may restrict the retention of personally identifiable information. Accordingly, a method or system is provided for enabling training of a language model. The method includes producing segments of text in a text corpus, together with counts corresponding to those segments, where the text corpus is in a depersonalized state. The method further includes enabling a system to train a language model using the segments of text in the depersonalized state and the counts. Because the data is depersonalized, actual data may be used, which enables speech recognizers to keep up to date with user trends in speech and usage, among other benefits.
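The abstract's idea of training from text segments and their counts can be illustrated with a minimal sketch. This is not the patented method: it assumes the "segments" are word bigrams, that the input text has already been depersonalized, and that the language model is a plain maximum-likelihood bigram model. The function names `shred` and `train_bigram_model`, the sentence markers, and the sample corpus are all hypothetical.

```python
from collections import Counter


def shred(sentences, n=2):
    """Break each sentence into n-word segments and count them.

    Assumption: the input is already depersonalized. Only the segments
    and their counts are returned; the original sentences and their
    order are not kept.
    """
    counts = Counter()
    for sentence in sentences:
        # Hypothetical sentence-boundary markers, added for illustration.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def train_bigram_model(counts):
    """Estimate P(w2 | w1) by maximum likelihood from bigram counts alone."""
    context_totals = Counter()
    for (w1, _w2), c in counts.items():
        context_totals[w1] += c
    return {(w1, w2): c / context_totals[w1] for (w1, w2), c in counts.items()}


# Illustrative, made-up corpus.
corpus = ["call the office", "call the doctor", "email the office"]
counts = shred(corpus)
model = train_bigram_model(counts)
```

In this sketch the trainer only ever sees `counts`, so the full utterances need not be retained once the segments and counts have been produced.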

Bibliographic data

Publication number: US2014278425A1
Patent type:
Publication date: 2014-09-18
Original format: PDF
Applicant/Assignee: NUANCE COMMUNICATIONS INC.
Application number: US201313800738
Inventors: WILLIAM F. GANONG III; PHILIP CHARLES WOODLAND; UWE HELMUT JOST; SYED RAZA SHAHID; PAUL J. VOZILA; MARCEL KATZ
Filing date: 2013-03-13
Classification: G10L15/06
Country: US
Database entry time: 2022-08-21 16:09:46