Nearest neighbor based i-vector normalization for robust speaker recognition under unseen channel conditions

Abstract

Many state-of-the-art speaker recognition engines use i-vectors to represent variable-length acoustic signals in a fixed low-dimensional total variability subspace. While such systems perform well under seen channel conditions, their performance greatly degrades under unseen channel scenarios. Accordingly, rapid adaptation of i-vector systems to unseen conditions has recently attracted significant research effort from the community. To mitigate this mismatch, in this paper we propose nearest neighbor based i-vector mean normalization (NN-IMN) and i-vector smoothing (IS) for unsupervised adaptation to unseen channel conditions within a state-of-the-art i-vector/PLDA speaker verification framework. A major advantage of the approach is its ability to handle multiple unseen channels without explicit retraining or clustering. Our observations on the DARPA Robust Automatic Transcription of Speech (RATS) speaker recognition task suggest that part of the distortion caused by an unseen channel may be modeled as an offset in the i-vector space. Hence, the proposed nearest neighbor based normalization technique is formulated to compensate for such a shift. Experimental results with the NN based normalized i-vectors indicate that, on average, we can recover 46% of the total performance degradation due to unseen channel conditions.
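The abstract does not spell out the NN-IMN formulation, so the following is only a minimal sketch of one plausible reading of the offset idea: if an unseen channel shifts i-vectors by a roughly constant offset, subtracting the mean of a vector's nearest neighbors in an unlabeled pool should remove much of that shift. The function name `nn_mean_normalize`, the Euclidean distance, the value of `k`, and the toy 2-D data are all assumptions made for illustration, not details taken from the paper.

```python
import math


def nn_mean_normalize(ivector, pool, k):
    """Subtract the mean of the k pool vectors nearest to `ivector`.

    Hypothetical helper: Euclidean distance and a plain neighbor mean
    are assumptions, not the paper's stated formulation.
    """
    if not 1 <= k <= len(pool):
        raise ValueError("k must be between 1 and len(pool)")
    # Rank pool vectors by distance to the input and keep the k closest.
    neighbors = sorted(pool, key=lambda p: math.dist(ivector, p))[:k]
    # Per-dimension mean of the selected neighbors (the local offset estimate).
    local_mean = [sum(p[d] for p in neighbors) / k for d in range(len(ivector))]
    return [x - m for x, m in zip(ivector, local_mean)]


# Toy data: a "seen" cluster around the origin and the same cluster
# shifted by an assumed channel offset of (5, 3).
seen_pool = [[-0.2, 0.1], [0.1, -0.1], [0.1, 0.0]]
unseen_pool = [[4.8, 3.1], [5.1, 2.9], [5.1, 3.0]]
pool = seen_pool + unseen_pool

test_ivector = [5.2, 3.1]  # recorded over the shifted channel
normalized = nn_mean_normalize(test_ivector, pool, k=3)
seen_normalized = nn_mean_normalize([0.1, 0.1], pool, k=3)
```

In this toy setup `normalized` comes out close to `[0.2, 0.1]`, that is, the shifted vector with the assumed offset removed, while a vector from the seen cluster is left nearly unchanged. Because the neighbor search is local, each cluster is normalized by its own mean without any channel labels or clustering step, which is in the spirit of the advantage the abstract describes.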

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2015 | pp. 4684-4688 | 5 pages
Authors
Zhu Weizhong; Sadjadi Seyed Omid; Pelecanos Jason W.
Original format
PDF
Keywords
PLDA; i-vector; nearest neighbor; speaker recognition; unsupervised adaptation

Similar literature

1. Source-Normalized LDA for Robust Speaker Recognition Using i-Vectors From Multiple Speech Sources [J]. McLaren M., van Leeuwen D. IEEE Transactions on Audio, Speech, and Language Processing. 2012, No. 3
2. Speaker Recognition With Random Digit Strings Using Uncertainty Normalized HMM-Based i-Vectors [J]. Maghsoodi Nooshin, Sameti Hossein, Zeinali Hossein. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2019, No. 11
3. I-vector based speaker recognition using advanced channel compensation techniques [J]. Ahilan Kanagasundaram, David Dean, Sridha Sridharan. Computer Speech and Language. 2014, No. 1
4. Nearest neighbor based i-vector normalization for robust speaker recognition under unseen channel conditions [C]. W. Zhu, S. O. Sadjadi, J. W. Pelecanos. IEEE International Conference on Acoustics, Speech and Signal Processing. 2015
5. The modified-mean cepstral mean normalization (MMCMN) method for channel-robust automatic speaker recognition [D]. Garcia, Alvin A. 2002
6. Emotion recognition from multichannel EEG signals using K-nearest neighbor classification [O]. Mi Li, Hongpei Xu, Xingwang Liu
7. Improving short utterance based I-vector speaker recognition using source and utterance-duration normalization techniques [O]. Kanagasundaram Ahilan, Dean David, Gonzalez-Dominguez Javier. 2013
8. Trial-Based Calibration for Speaker Recognition in Unseen Conditions [R]. McLaren, M., Lawson, A., Ferrer, L. 2014
