Annual Conference of the International Speech Communication Association

Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems

Abstract

This paper compares schemes for the selection of multi-genre broadcast data and corresponding transcriptions for speech recognition model training. Selections of the same amount of data (700 hours) from lightly supervised alignments based on the same original subtitle transcripts are compared. Data segments were selected according to a maximum phone matched error rate between the lightly supervised decoding and the original transcript. The data selected with an improved lightly supervised system yields lower word error rates (WERs). Detailed comparisons of the data selected on carefully transcribed development data show how the selected portions match the true phone error rate for each genre. From a broader perspective, it is shown that for different genres, either the original subtitles or the lightly supervised output should be used for model training and a suitable combination yields further reductions in final WER.
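The selection step described in the abstract, keeping segments whose mismatch between the lightly supervised decoding and the original transcript stays under a maximum, can be illustrated with a minimal sketch. The abstract does not define the phone matched error rate, so this sketch substitutes a plain Levenshtein phone error rate normalized by the transcript length. The names `Segment`, `phone_mismatch_rate`, and `select_segments`, as well as the threshold value, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_mismatch_rate(transcript_phones, decoded_phones):
    """Assumed stand-in for the paper's measure: edit distance between the
    phone sequences, divided by the transcript length."""
    if not transcript_phones:
        return 0.0 if not decoded_phones else 1.0
    return edit_distance(transcript_phones, decoded_phones) / len(transcript_phones)


@dataclass
class Segment:
    seg_id: str
    transcript_phones: list  # phones derived from the original subtitles
    decoded_phones: list     # phones from the lightly supervised decoding
    duration_s: float


def select_segments(segments, max_rate):
    """Keep segments whose mismatch rate does not exceed max_rate."""
    return [
        s for s in segments
        if phone_mismatch_rate(s.transcript_phones, s.decoded_phones) <= max_rate
    ]


if __name__ == "__main__":
    segments = [
        Segment("a", ["k", "ae", "t"], ["k", "ae", "t"], 2.0),
        Segment("b", ["d", "ao", "g"], ["d", "ih", "g"], 3.0),
        Segment("c", ["f", "ih", "sh"], ["b", "er", "d"], 4.0),
    ]
    kept = select_segments(segments, max_rate=0.4)  # hypothetical threshold
    print([s.seg_id for s in kept], sum(s.duration_s for s in kept))
```

To compare selections of equal size, as the abstract does with 700 hours, the threshold could be tuned until the summed duration of the kept segments reaches the target; that tuning loop is left out here.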
