DETECTING AND RECOVERING OUT-OF-VOCABULARY WORDS IN VOICE-TO-TEXT TRANSCRIPTION SYSTEMS

Abstract

Aspects of the present disclosure describe techniques for identifying and recovering out-of-vocabulary words in transcripts of a voice data recording using word recognition models and word sub-unit recognition models. An example method generally includes receiving a voice data recording for transcription into a textual representation of the voice data recording. The voice data recording is transcribed into the textual representation using a word recognition model. An unknown word is identified in the textual representation, and the unknown word is reconstructed based on recognition of sub-units of the unknown word generated by a sub-unit recognition model. The textual representation of the voice data recording is modified by replacing the unknown word with the reconstruction of the unknown word, and the modified textual representation is output.
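
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how sub-units are matched to an unknown word, so this example assumes both models emit time-stamped tokens and that the word model marks out-of-vocabulary words with a placeholder. The names `Token`, `UNK`, `reconstruct`, and `recover_oov` are hypothetical, and the model outputs are hand-written stand-ins.

```python
from dataclasses import dataclass
from typing import List

UNK = "<unk>"  # assumed marker the word model emits for out-of-vocabulary words


@dataclass
class Token:
    text: str     # recognized word or sub-unit, or UNK
    start: float  # start time in seconds
    end: float    # end time in seconds


def reconstruct(span: Token, subunits: List[Token]) -> str:
    """Join the sub-units whose midpoint falls inside the unknown word's time span."""
    pieces = [s.text for s in subunits
              if span.start <= (s.start + s.end) / 2 < span.end]
    return "".join(pieces) or UNK  # keep the marker if nothing was recovered


def recover_oov(words: List[Token], subunits: List[Token]) -> str:
    """Replace each unknown word with its sub-unit reconstruction."""
    out = [reconstruct(w, subunits) if w.text == UNK else w.text for w in words]
    return " ".join(out)


# Hand-written stand-ins for the two models' outputs (not real recognizer output).
words = [Token("call", 0.0, 0.4), Token(UNK, 0.4, 1.0), Token("today", 1.0, 1.5)]
subunits = [Token("ca", 0.0, 0.2), Token("ll", 0.2, 0.4),
            Token("zy", 0.4, 0.6), Token("lo", 0.6, 0.8), Token("ft", 0.8, 1.0),
            Token("to", 1.0, 1.2), Token("day", 1.2, 1.5)]

print(recover_oov(words, subunits))  # prints "call zyloft today"
```

Time alignment is only one way to pair sub-units with an unknown word. It is used here because it keeps the replacement step simple and leaves known words untouched.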