An Unsupervised Adaptation Method for Deep Neural Network-based Large Vocabulary Continuous Speech Recognition

Yeming Xiao; Yujing Si; Ji Xu; Jielin Pan; Yonghong Yan


Abstract

Recently, the application of Deep Neural Networks (DNNs) to acoustic modeling has achieved great success on Large Vocabulary Continuous Speech Recognition (LVCSR) tasks. However, the performance of DNN-based Automatic Speech Recognition (ASR) systems still suffers greatly from the mismatch between training and testing conditions in real applications. In the past, commonly used methods for DNN-based acoustic model adaptation have focused on supervised methods such as affine transformation and regularized training. There has been little work on unsupervised adaptation methods for DNNs. In many cases, however, matched data with transcriptions is very expensive to obtain, while a tremendous amount of unlabelled data is available. In this paper, a novel unsupervised adaptation approach is proposed to mitigate the effects of the mismatch. Specifically, the original DNN is adapted with the acoustic observations of the unlabelled data, and the boosted posterior probabilities generated with the original DNN are used as training targets. With around 1000 hours of unlabelled data used for adaptation, experimental results on a Mandarin voice search recognition task demonstrate the effectiveness of the proposed adaptation technique. Compared to the baseline, the adapted DNN achieves a 10% relative Character Error Rate (CER) reduction.
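The adaptation idea in the abstract (use the original model's boosted posteriors on unlabelled frames as soft training targets) can be illustrated with a small sketch. The abstract does not define how the posteriors are boosted, so the power-and-renormalize rule below is an assumption chosen only for illustration, not the authors' formula. A single softmax layer stands in for the DNN, the features are random, and all names (`boost_posteriors`, `adapt_step`, `gamma`, `lr`) are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def boost_posteriors(post, gamma=2.0):
    """ASSUMED boosting rule: raise each posterior to the power gamma and
    renormalize, which sharpens the distribution when gamma > 1."""
    p = post ** gamma
    return p / p.sum(axis=1, keepdims=True)

def cross_entropy(targets, post):
    """Mean cross-entropy between soft targets and model posteriors."""
    return float(-(targets * np.log(post + 1e-12)).sum(axis=1).mean())

def adapt_step(W, b, feats, targets, lr=0.1):
    """One gradient-descent step on the soft-target cross-entropy."""
    post = softmax(feats @ W + b)
    grad_logits = (post - targets) / feats.shape[0]
    return W - lr * feats.T @ grad_logits, b - lr * grad_logits.sum(axis=0)

# Toy stand-in for unlabelled acoustic observations: 8 frames, 5-dim features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 5))
W0 = rng.normal(size=(5, 3)) * 0.1   # "original" model weights (3 classes)
b0 = np.zeros(3)

# Targets come from the ORIGINAL model and stay fixed during adaptation.
orig_post = softmax(feats @ W0 + b0)
targets = boost_posteriors(orig_post)

loss_before = cross_entropy(targets, orig_post)
W, b = W0.copy(), b0.copy()
for _ in range(20):
    W, b = adapt_step(W, b, feats, targets)
loss_after = cross_entropy(targets, softmax(feats @ W + b))
```

No transcriptions appear anywhere in the loop: the only supervision signal is the sharpened output of the unadapted model, which is what makes the procedure unsupervised. In a real system the same target construction would be applied to the senone posteriors of a full DNN rather than to a single linear layer.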


Bibliographic details

Source
Journal of Information and Computational Science, 2014, No. 14, pp. 4889-4899 (11 pages)
Authors
Yeming Xiao; Yujing Si; Ji Xu; Jielin Pan; Yonghong Yan
Author affiliation

Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China (all authors)

Indexing information
Original format: PDF
Text language: English
Keywords
Speech Recognition; Deep Neural Network; Unsupervised Adaptation; Regularized Training


