IEEE International Conference on Acoustics, Speech and Signal Processing

Semi-supervised Training for End-to-end Models via Weak Distillation



Abstract

End-to-end (E2E) models are a promising research direction in speech recognition, as a single all-neural E2E system offers a much simpler and more compact solution than a conventional model, which has separate acoustic (AM), pronunciation (PM), and language models (LM). However, it has been noted that E2E models perform poorly on tail words and proper nouns, likely because end-to-end optimization requires joint audio-text pairs and does not take advantage of the additional lexicons and large amounts of text-only data used to train the LMs in conventional models. There have been numerous efforts to train an RNN-LM on text-only data and fuse it into the end-to-end model. In this work, we contrast that approach with training the E2E model on audio-text pairs generated from unsupervised speech data. To target the proper-noun issue specifically, we adopt a Part-of-Speech (POS) tagger to filter the unsupervised data, keeping only utterances that contain proper nouns. We show that training with filtered unsupervised data provides up to a 13% relative reduction in word error rate (WER), and when used in conjunction with a cold-fusion RNN-LM, up to a 17% relative improvement.
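The filtering step the abstract describes can be sketched as follows. The paper uses a real POS tagger to keep only unsupervised utterances whose hypothesized transcripts contain proper nouns; in this minimal sketch, a simple capitalization heuristic stands in for the tagger, and the pair format and function names are illustrative assumptions, not the authors' implementation.

```python
# Sketch of proper-noun filtering for unsupervised audio-text pairs.
# A capitalized, non-sentence-initial alphabetic token is treated as a
# proper noun -- a crude stand-in for a real POS tagger's NNP tag.

def contains_proper_noun(transcript: str) -> bool:
    """Heuristic NNP detection: any capitalized word after the first
    token is treated as a proper noun."""
    tokens = transcript.split()
    return any(tok[0].isupper() for tok in tokens[1:] if tok and tok[0].isalpha())

def filter_unsupervised_pairs(pairs):
    """Keep only (audio_id, transcript) pairs whose hypothesized
    transcript contains a proper noun, mirroring the filtering that
    targets tail words and proper nouns."""
    return [(audio, text) for audio, text in pairs if contains_proper_noun(text)]

pairs = [
    ("utt1", "play songs by Taylor Swift"),
    ("utt2", "turn up the volume"),
    ("utt3", "navigate to Mountain View"),
]
print(filter_unsupervised_pairs(pairs))
# keeps utt1 and utt3; utt2 has no proper noun
```

In practice a trained tagger (rather than a capitalization rule) would be applied to the ASR hypotheses, since hypothesized transcripts of unsupervised audio may not carry reliable casing.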

