Increasing discriminative capability on MAP-based mapping function estimation for acoustic model adaptation

Abstract

In this study, we propose increasing the discriminative power of maximum a posteriori (MAP)-based mapping function estimation for acoustic model adaptation. Building on the effective and stable learning of MAP-based estimation, we incorporate a discriminative term and derive a new objective function. Applying the new objective function to online mapping function estimation, we develop discriminative maximum a posteriori (DMAP) linear regression (DMAPLR) and DMAP-based ensemble speaker and speaking environment modeling (DMAP-based ESSEM). We evaluate DMAPLR and DMAP-based ESSEM on the Aurora-2 task in a supervised adaptation mode. The experimental results show that both DMAPLR and DMAP-based ESSEM consistently improve over their ML-based and MAP-based counterparts, whether one, two, or three adaptation utterances are used. These improvements confirm the strong effect of increasing discriminative capability in MAP-based mapping function estimation. Moreover, we verify that including multiple knowledge sources in the objective function can efficiently enhance model adaptation performance. Compared with the baseline result, DMAP-based ESSEM achieves a 15.96% relative reduction in average word error rate (WER), from 9.21% to 7.74%, using only one adaptation utterance.
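
The 15.96% figure quoted in the abstract is consistent with a relative (not absolute) WER reduction computed from the two reported error rates. A minimal sketch of that arithmetic follows; the function name `relative_wer_reduction` is an illustrative assumption and does not come from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in percent (e.g. 9.21 for 9.21%).
    The name and signature are hypothetical, chosen for illustration only.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Values reported in the abstract: baseline 9.21%, adapted 7.74%.
reduction = relative_wer_reduction(9.21, 7.74)
print(f"{reduction:.2f}%")  # prints 15.96%
```

The absolute reduction is 9.21 - 7.74 = 1.47 percentage points; dividing by the baseline of 9.21 gives the relative figure.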

Bibliographic information

Source
2011 IEEE International Conference on Acoustics, Speech and Signal Processing | 2011 | pp. 5320-5323 | 4 pages
Authors
Tsao Yu; Isotani Ryosuke; Kawai Hisashi; Nakamura Satoshi
Original format: PDF
Chinese Library Classification: Communication theory
Keywords
Automatic speech recognition; ESSEM; MAP-based ESSEM; MAPLR; MLLR; discriminative training

Similar literature

1. Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation [J]. Bowen Zhou, Hansen J.H.L. IEEE Transactions on Speech and Audio Processing. 2005, No. 4
2. Incorporating local information of the acoustic environments to MAP-based feature compensation and acoustic model adaptation [J]. Yu Tsao, Xugang Lu, Paul Dixon, Computer Speech and Language. 2014, No. 3
3. Correctness-Adjusted Unsupervised Discriminative Acoustic Model Adaptation [J]. Gibson M., Hain T. Audio, Speech, and Language Processing, IEEE Transactions on. 2012, No. 10
4. Increasing discriminative capability on MAP-based mapping function estimation for acoustic model adaptation [C]. Yu Tsao, Ryosuke Isotani, Hisashi Kawai, IEEE International Conference on Acoustics, Speech and Signal Processing. 2011
5. Discriminative training for speaker adaptation and minimum Bayes risk estimation in large vocabulary speech recognition [D]. Doumpiotis, Vlasios. 2005
6. Time-Frequency Distribution Map-Based Convolutional Neural Network (CNN) Model for Underwater Pipeline Leakage Detection Using Acoustic Signals [O]. Yingchun Xie, Yucheng Xiao, Xuyan Liu, 2020
7. Incorporating local information of the acoustic environments to MAP-based feature compensation and acoustic model adaptation [O]. Tsao Yu, Lu Xugang, Dixon Paul, 2014
8. Using Model-Based Parameter Estimation to Increase the Efficiency of Computing Electromagnetic Transfer Functions [R]. Burke, G. J., Miller, E. K., Chakrabarti, S., 1989
