Speaker Adaptation of Neural Network Acoustic Models Using I-Vectors

Abstract

We propose to adapt deep neural network (DNN) acoustic models to a target speaker by supplying speaker identity vectors (i-vectors) as input features to the network in parallel with the regular acoustic features for ASR. For both training and test, the i-vector for a given speaker is concatenated to every frame belonging to that speaker and changes across different speakers. Experimental results on a 300-hour Switchboard corpus show that DNNs trained on speaker-independent features and i-vectors achieve a 10% relative improvement in word error rate (WER) over networks trained on speaker-independent features only. These networks are comparable in performance to DNNs trained on speaker-adapted features (with VTLN and FMLLR), with the advantage that only one decoding pass is needed. Furthermore, networks trained on speaker-adapted features and i-vectors achieve a 5-6% relative improvement in WER after Hessian-free sequence training over networks trained on speaker-adapted features only.
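
The input construction described in the abstract (one fixed i-vector per speaker, appended to every frame of that speaker) can be illustrated with a minimal NumPy sketch. The function name `append_ivectors`, the speaker labels, and the dimensions (40 acoustic features, 100-dimensional i-vectors) are assumptions made for illustration and are not taken from the paper; i-vector extraction and the DNN itself are outside the scope of the sketch.

```python
import numpy as np

def append_ivectors(frames, speaker_ids, ivectors):
    """Concatenate each frame with the i-vector of the speaker who produced it.

    frames:      (num_frames, feat_dim) array of acoustic features
    speaker_ids: sequence of num_frames speaker labels, one per frame
    ivectors:    dict mapping speaker label -> (ivec_dim,) array
    (All names and shapes here are hypothetical, chosen for illustration.)
    """
    frames = np.asarray(frames, dtype=np.float32)
    if len(speaker_ids) != frames.shape[0]:
        raise ValueError("need exactly one speaker label per frame")
    # Look up the i-vector of each frame's speaker; it is constant within
    # a speaker and differs between speakers.
    per_frame = np.stack([ivectors[s] for s in speaker_ids]).astype(np.float32)
    # Network input for each frame = [acoustic features, i-vector].
    return np.concatenate([frames, per_frame], axis=1)

# Toy data: dimensions are illustrative assumptions, not the paper's setup.
rng = np.random.default_rng(0)
frames = rng.standard_normal((5, 40))
speaker_ids = ["spk_a", "spk_a", "spk_a", "spk_b", "spk_b"]
ivectors = {"spk_a": rng.standard_normal(100),
            "spk_b": rng.standard_normal(100)}

net_input = append_ivectors(frames, speaker_ids, ivectors)
print(net_input.shape)  # (5, 140)
```

The same function would be applied at training and test time, so the network always sees acoustic features and speaker information side by side in a single pass.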


Bibliographic record

Source
Workshop on Automatic Speech Recognition and Understanding | 2013 | 5 pages
Authors
George Saon; Hagen Soltau; David Nahamoo; Michael Picheny
Original format: PDF
Chinese Library Classification: TN912.3-532
Keywords
DNN; WER; VTLN

Similar literature

1. I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models [J]. Penny Karanasou, Chunyang Wu, Mark Gales. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, No. 4.
2. Text-dependent speaker verification based on i-vectors, Neural Networks and Hidden Markov Models [J]. Zeinali Hossein, Sameti Hossein, Burget Lukáš. Computer Speech and Language, 2017.
3. A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech [J]. Yan-Hui Tu, Jun Du, Chin-Hui Lee. Journal of Signal Processing Systems for Signal, Image, and Video Technology, 2018, No. 7.
4. Speaker adaptation of neural network acoustic models using i-vectors [C]. Saon George, Soltau Hagen, Nahamoo David. IEEE Workshop on Automatic Speech Recognition and Understanding, 2013.
5. Speaker Characteristic-based Acoustic Model Adaptation Method for Speaker Recognition Systems [D]. Millington, Daniel S. 2011.
6. Modeling and Sensitivity Analysis of Acoustic Release of Doxorubicin from Unstabilized Pluronic P105 using an Artificial Neural Network Model [O]. Ghaleb A. Husseini, Nabil M. Abdel-Jabbar, Farouq S. Mjalli.
7. Multi-task deep neural network acoustic models with model adaptation using discriminative speaker identity for whisper recognition [O]. Li Jingjie, McLoughlin Ian Vince, Liu Cong. 2016.
