New Issues on Nonlinear Discriminant Analysis for Speaker Recognition


Abstract

In this paper, we study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. We present several system architectures for extracting features that are more invariant to non-speaker-related conditions such as handset type and channel effects: (a) The first approach uses a Time Delay Neural Network (TDNN). (b) The second approach uses the Dynamic Decay Adjustment (DDA) algorithm to train a Radial Basis Function (RBF) network. (c) Finally, we linearly combine the normalized scores of the TDNN and RBF architectures, maximizing the separation between speakers by nonlinearly projecting a large set of acoustic features onto a lower-dimensional feature set. The proposed architecture exploits both the temporal variation of the speech signal and the modeling power of neural networks (NN). The extracted features are optimized to discriminate between speakers and to be robust to mismatched training and testing conditions. The transformed features are then used to train a GMM-based speaker identification system. We trained and tested the proposed architectures on the 45-speaker SPIDRE corpus of telephone conversations. The results show an improvement of more than 12% compared to our standard system.
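The linear combination of normalized TDNN and RBF scores described in (c) can be illustrated with a minimal sketch. The abstract does not state which normalization or fusion weight is used, so the z-score normalization, the equal weight of 0.5, the function names, and the toy score values below are all assumptions made for illustration, not the authors' method.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale scores to zero mean and unit variance (assumed normalization)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: no speaker is preferred by this system.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(tdnn_scores, rbf_scores, weight=0.5):
    """Linearly combine two per-speaker score lists after normalization.

    `weight` is a hypothetical fusion weight for the TDNN system;
    the RBF system receives 1 - weight.
    """
    if len(tdnn_scores) != len(rbf_scores):
        raise ValueError("both systems must score the same set of speakers")
    tdnn_norm = z_normalize(tdnn_scores)
    rbf_norm = z_normalize(rbf_scores)
    return [weight * t + (1.0 - weight) * r for t, r in zip(tdnn_norm, rbf_norm)]


def identify(fused_scores):
    """Return the index of the speaker with the highest fused score."""
    return max(range(len(fused_scores)), key=fused_scores.__getitem__)


# Toy scores for three enrolled speakers (made-up numbers on different scales).
tdnn_scores = [0.2, 0.9, 0.4]
rbf_scores = [10.0, 30.0, 50.0]

fused = fuse_scores(tdnn_scores, rbf_scores)
best_speaker = identify(fused)
print(best_speaker)
```

Normalizing before fusion matters because the two systems may produce scores on very different scales; without it, the system with the larger numeric range would dominate the sum regardless of the weight.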
