DETECTING AND RECOVERING OUT OF VOCABULARY WORDS IN SPEECH-TO-TEXT TRANSCRIPTION SYSTEMS

Abstract

Aspects of the present disclosure describe methods for identifying and recovering out-of-vocabulary words in transcripts of a voice data recording using word recognition models and sub-unit recognition models. An example method generally includes receiving a voice data recording for transcription into a textual representation of the voice data recording. The voice data recording is transcribed into the textual representation using a word recognition model. An unknown word is identified in the textual representation, and the unknown word is reconstructed based on recognition of sub-units of the unknown word generated by a sub-unit recognition model. The textual representation of the voice data recording is modified by replacing the unknown word with the reconstruction of the unknown word, and the modified textual representation is output.
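
The flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the method claimed in the patent: the `<unk>` marker, the `Token` type, the use of time spans to align the two models, and the two stub models are all assumptions introduced here, and joining sub-units by simple concatenation stands in for whatever reconstruction the disclosure actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical marker that the word-level model emits for an unknown word.
UNK = "<unk>"


@dataclass
class Token:
    text: str     # recognized word, or UNK
    start: float  # start time in seconds (assumed alignment info)
    end: float    # end time in seconds


WordModel = Callable[[bytes], List[Token]]
SubUnitModel = Callable[[bytes, float, float], List[str]]


def recover_oov(audio: bytes, word_model: WordModel,
                subunit_model: SubUnitModel) -> str:
    """Transcribe audio, then replace each unknown word with a
    reconstruction built from recognized sub-units."""
    tokens = word_model(audio)  # word-level transcription
    words = []
    for tok in tokens:
        if tok.text == UNK:
            # Ask the sub-unit model for the same time span and
            # concatenate its output (a simplifying assumption).
            units = subunit_model(audio, tok.start, tok.end)
            words.append("".join(units))
        else:
            words.append(tok.text)
    return " ".join(words)  # modified textual representation


# Stub models standing in for real recognizers (purely illustrative).
def fake_word_model(audio: bytes) -> List[Token]:
    return [Token("cook", 0.0, 0.4), Token("the", 0.4, 0.5),
            Token(UNK, 0.5, 1.1)]


def fake_subunit_model(audio: bytes, start: float, end: float) -> List[str]:
    return ["qui", "noa"]


result = recover_oov(b"", fake_word_model, fake_subunit_model)
print(result)
```

Passing the models in as callables keeps the sketch independent of any particular speech recognition library; a real system would substitute its own recognizers and a more careful reconstruction step.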
