DATA SHREDDING FOR SPEECH RECOGNITION LANGUAGE MODEL TRAINING UNDER DATA RETENTION RESTRICTIONS

Abstract

Training speech recognizers (for example, their language or acoustic models) on actual user data is useful, but regulations in some environments may restrict the retention of personally identifiable information. Accordingly, a method or system is provided for enabling training of a language model. The method includes producing segments of text in a text corpus, along with counts corresponding to those segments, the text corpus being in a depersonalized state. The method further includes enabling a system to train a language model using the depersonalized segments of text and the counts. Because the data is depersonalized, actual user data may be used, which enables speech recognizers to keep up to date with user trends in speech and usage, among other benefits.
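The idea of keeping only text segments and their counts can be illustrated with a minimal Python sketch. The abstract does not say what form the segments take; this sketch assumes they are word n-grams, and it assumes the input utterances have already been depersonalized. The names `shred` and `bigram_probability` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def shred(utterances, n=2):
    """Break each utterance into word n-grams (sizes 1..n) and count them.

    Assumption: the utterances are already depersonalized. Only the
    n-grams and their aggregate counts are returned; the original
    sentences and their order are not kept.
    """
    counts = Counter()
    for text in utterances:
        tokens = text.lower().split()
        for size in range(1, n + 1):
            for i in range(len(tokens) - size + 1):
                counts[tuple(tokens[i:i + size])] += 1
    return counts


def bigram_probability(counts, prev, word):
    """Relative-frequency estimate count(prev, word) / count(prev)."""
    context = counts[(prev,)]
    return counts[(prev, word)] / context if context else 0.0


corpus = ["call mom now", "call mom later"]
counts = shred(corpus)
print(bigram_probability(counts, "call", "mom"))  # 1.0
print(bigram_probability(counts, "mom", "now"))   # 0.5
```

In this sketch a simple bigram estimate is computed from the counts alone, which is the sense in which a model could be trained without retaining the original utterances. A practical system would likely add smoothing and longer segments, neither of which the abstract specifies.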
