Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems

Abstract

This paper compares schemes for the selection of multi-genre broadcast data and corresponding transcriptions for speech recognition model training. Selections of the same amount of data (700 hours) from lightly supervised alignments based on the same original subtitle transcripts are compared. Data segments were selected according to a maximum phone matched error rate between the lightly supervised decoding and the original transcript. The data selected with an improved lightly supervised system yields lower word error rates (WERs). Detailed comparisons of the data selected on carefully transcribed development data show how the selected portions match the true phone error rate for each genre. From a broader perspective, it is shown that for different genres, either the original subtitles or the lightly supervised output should be used for model training and a suitable combination yields further reductions in final WER.
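The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the phone matched error rate, so this sketch assumes it is the Levenshtein distance between the phone sequence of the original subtitle and that of the lightly supervised decoding, normalised by the subtitle length. The `Segment` type, its field names, and the threshold value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_matched_error_rate(subtitle_phones: Sequence[str],
                             decoded_phones: Sequence[str]) -> float:
    """Assumed definition: edit distance normalised by subtitle length."""
    if not subtitle_phones:
        return 0.0 if not decoded_phones else 1.0
    return edit_distance(subtitle_phones, decoded_phones) / len(subtitle_phones)


@dataclass
class Segment:  # hypothetical container, not taken from the paper
    segment_id: str
    duration_s: float
    subtitle_phones: List[str]
    decoded_phones: List[str]


def select_segments(segments: List[Segment], max_pmer: float) -> List[Segment]:
    """Keep segments whose error rate does not exceed the threshold."""
    return [s for s in segments
            if phone_matched_error_rate(s.subtitle_phones, s.decoded_phones) <= max_pmer]


def total_hours(segments: List[Segment]) -> float:
    """Total duration of the given segments in hours."""
    return sum(s.duration_s for s in segments) / 3600.0


segments = [
    Segment("a", 3600.0, ["k", "ae", "t"], ["k", "ae", "t"]),
    Segment("b", 1800.0, ["d", "ao", "g"], ["d", "ih", "g"]),
    Segment("c", 1800.0, ["b", "er", "d"], ["k", "ae", "t"]),
]
selected = select_segments(segments, max_pmer=0.4)
print([s.segment_id for s in selected], total_hours(selected))
```

To compare selections of equal size, as the abstract does with 700 hours, the threshold could be raised or lowered until `total_hours(selected)` reaches the target amount.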

Bibliographic details

Source
Annual Conference of the International Speech Communication Association | 2016 | pp. 2318-3105 | 5 pages
Authors
P. Lanchantin; M.J.F. Gales; P. Karanasou; X. Liu; Y. Qian; L. Wang; P.C. Woodland; C. Zhang
Original format: PDF
Chinese Library Classification: TB95-53

