Semi-Supervised Acoustic Model Retraining for Medical ASR

Abstract

Training models for speech recognition usually requires accurate word-level transcription of available speech data. For the domain of medical dictations, it is common to have "semi-literal" transcripts available: large numbers of speech files along with their associated formatted episode report, whose content only partially overlaps with the spoken content of the dictation. We present a semi-supervised method for generating acoustic training data by decoding dictations with an existing recognizer, confirming which sections are correct by using the associated report, and repurposing these audio sections for training a new acoustic model. The effectiveness of this method is demonstrated in two applications: first, to adapt a model to new speakers, resulting in a 19.7% reduction in relative word errors for these speakers; and second, to supplement an already diverse and robust acoustic model with a large quantity of additional data (from already known voices), leading to a 5.0% relative error reduction on a large test set of over one thousand speakers.
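
The selection step described in the abstract (decode, confirm against the report, keep the confirmed audio) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how hypotheses are aligned with reports, so this sketch substitutes a plain longest-matching-block alignment from Python's standard `difflib`. The names `Word`, `confirmed_segments`, and `min_words` are hypothetical, and the recognizer is assumed to supply per-word start and end times.

```python
import re
from difflib import SequenceMatcher
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    """One recognized word with its time span in seconds (assumed recognizer output)."""
    text: str
    start: float
    end: float


def normalize(text: str) -> List[str]:
    """Lowercase the report and drop punctuation so it compares with ASR tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def confirmed_segments(
    hyp: List[Word], report: str, min_words: int = 3
) -> List[Tuple[float, float, str]]:
    """Return (start, end, transcript) for runs of hypothesis words that also
    appear, in the same order, in the formatted report.

    Runs shorter than min_words are discarded, since short matches are more
    likely to be coincidental (the threshold is an illustrative assumption).
    """
    hyp_tokens = [w.text.lower() for w in hyp]
    report_tokens = normalize(report)
    matcher = SequenceMatcher(a=hyp_tokens, b=report_tokens, autojunk=False)

    segments = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            words = hyp[block.a:block.a + block.size]
            transcript = " ".join(w.text for w in words)
            segments.append((words[0].start, words[-1].end, transcript))
    return segments
```

Each returned tuple marks an audio span whose recognized words the report confirms, which is the kind of material that could be cut out and reused as acoustic training data. A production system would likely also need to handle formatting differences between speech and report (numbers, abbreviations, section headings), which this sketch ignores.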

Bibliographic record

Source
International Conference on Speech and Computer | 2018 | pp. 177-187 | 11 pages
Authors
Greg P. Finley; Erik Edwards; Wael Salloum; Amanda Robinson; Najmeh Sadoughi; Nico Axtmann; Maxim Korenevsky; Michael Brenndoerfer; Mark Miller; David Suendermann-Oeft
Original format: PDF
Keywords
Medical speech recognition; ASR; Medical dictation; Acoustic modeling

Similar literature

1. Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems’ Hypotheses [J]. Sheng Li, Yuya Akita, Tatsuya Kawahara. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2016, No. 9
2. Improving the Slovak LVCSR Performance by Cluster-Sensitive Acoustic Model Retraining [J]. Advances in Electrical and Electronic Engineering. 2015, No. 4
3. Development and analysis of Punjabi ASR system for mobile phones under different acoustic models [J]. Puneet Mittal, Navdeep Singh. International Journal of Speech Technology. 2019, No. 1
4. Semi-Supervised Acoustic Model Retraining for Medical ASR [C]. Greg P. Finley, Erik Edwards, Wael Salloum, et al. International Conference on Speech and Computer. 2018
5. Deep Neural Network acoustic models for ASR [D]. Abdel-rahman Mohamed. 2014
6. IMDAV reaction between phenylmaleic anhydride and thienyl(furyl)allylamines: synthesis and molecular structure of (3aSR,4RS,4aRS,7aSR)-5-oxothieno- and (3aSR,4SR,4aRS,7aSR)-5-oxofuro[2,3-f]isoindole-4-carboxylic acids [O]. Flavien A. A. Toze, Maryana A. Nadirova, Dmitriy F. Mertsalov. 2018
7. Semi-supervised discriminative language modeling for Turkish ASR [O]. Arda Çelebi, Hasim Sak, Erinç Dikici. 2012
