Conference: International Conference on Speech and Computer

Semi-Supervised Acoustic Model Retraining for Medical ASR



Abstract

Training models for speech recognition usually requires accurate word-level transcription of the available speech data. In the domain of medical dictation, it is common to have only "semi-literal" transcripts available: large numbers of speech files, each with an associated formatted episode report whose content only partially overlaps with the spoken content of the dictation. We present a semi-supervised method for generating acoustic training data by decoding dictations with an existing recognizer, using the associated report to confirm which sections were recognized correctly, and repurposing these audio sections for training a new acoustic model. The effectiveness of this method is demonstrated in two applications: first, adapting a model to new speakers, which yields a 19.7% relative reduction in word errors for these speakers; and second, supplementing an already diverse and robust acoustic model with a large quantity of additional data (from already known voices), which leads to a 5.0% relative error reduction on a large test set of over one thousand speakers.
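The confirmation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the recognizer output is matched against the report, so everything here is an assumption made for illustration: the function name `confirmed_segments`, the `(word, start_sec, end_sec)` tuple layout, the `min_words` threshold, and the use of Python's standard `difflib.SequenceMatcher` as the alignment method. The report is assumed to be already tokenized and stripped of punctuation.

```python
from difflib import SequenceMatcher


def confirmed_segments(hypothesis, report_words, min_words=3):
    """Return audio spans whose recognized words also appear in the report.

    hypothesis   -- list of (word, start_sec, end_sec) tuples from an
                    existing recognizer (hypothetical layout)
    report_words -- list of words from the formatted report, assumed to be
                    tokenized and normalized already
    min_words    -- shortest matching run accepted as "confirmed"
                    (illustrative threshold, not taken from the paper)
    """
    hyp_words = [word.lower() for word, _, _ in hypothesis]
    ref_words = [word.lower() for word in report_words]

    # autojunk=False keeps frequent words from being discarded as junk
    # when the sequences are long.
    matcher = SequenceMatcher(a=hyp_words, b=ref_words, autojunk=False)

    segments = []
    for block in matcher.get_matching_blocks():
        if block.size < min_words:
            continue  # also skips the zero-length terminator block
        run = hypothesis[block.a:block.a + block.size]
        start_sec = run[0][1]
        end_sec = run[-1][2]
        segments.append((start_sec, end_sec, [word for word, _, _ in run]))
    return segments


if __name__ == "__main__":
    # Spoken formatting commands ("period", "next line") are dictated but
    # do not appear as words in the formatted report.
    hypothesis = [
        ("the", 0.0, 0.2), ("patient", 0.2, 0.7), ("has", 0.7, 0.9),
        ("a", 0.9, 1.0), ("fever", 1.0, 1.5), ("period", 1.5, 1.9),
        ("next", 1.9, 2.2), ("line", 2.2, 2.5), ("no", 2.5, 2.7),
        ("known", 2.7, 3.0), ("allergies", 3.0, 3.6),
    ]
    report = "The patient has a fever No known allergies".split()
    for start, end, words in confirmed_segments(hypothesis, report):
        print(f"{start:.1f}-{end:.1f}s: {' '.join(words)}")
```

Each returned span pairs a time range with the words confirmed for it, which is the form of data needed to cut audio sections for acoustic model training. A longer `min_words` would presumably trade training data volume for higher confidence that a span was recognized correctly.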
