IEEE Transactions on Neural Networks and Learning Systems

Factor Analysis of Auto-Associative Neural Networks With Application in Speaker Verification



Abstract

Auto-associative neural network (AANN) is a fully connected feed-forward neural network, trained to reconstruct its input at its output through a hidden compression layer that has fewer nodes than the dimensionality of the input. AANNs are used to model speakers in speaker verification, where a speaker-specific AANN model is obtained by adapting (or retraining) the universal background model (UBM) AANN, an AANN trained on multiple held-out speakers, using the corresponding speaker data. When the amount of speaker data is limited, this adaptation procedure may lead to overfitting, as all the parameters of the UBM-AANN are adapted. In this paper, we introduce and develop the factor analysis theory of AANNs to alleviate this problem. We hypothesize that only the weight matrix connecting the last nonlinear hidden layer and the output layer is speaker-specific, and further restrict it to a common low-dimensional subspace during adaptation. The subspace is learned using large amounts of development data and is held fixed during adaptation. Thus, only the coordinates in the subspace, also known as the i-vector, need to be estimated from speaker-specific data. Update equations are derived for learning both the common low-dimensional subspace and the i-vectors of the speakers in that subspace. The resulting i-vector representation is used as a feature for a probabilistic linear discriminant analysis model. The proposed system shows promising results on the NIST-08 speaker recognition evaluation (SRE) and yields a 23% relative improvement in equal error rate over the previously proposed weighted least-squares-based subspace AANN system. Experiments on NIST-10 SRE confirm that these improvements are consistent and generalize across datasets.
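The subspace-constrained adaptation described in the abstract can be illustrated with a minimal sketch. It assumes a single tanh compression layer, a linear output layer, and a closed-form least-squares estimate of the speaker coordinates in place of the paper's derived update equations; all dimensions, variable names, and the ridge-free solver below are illustrative, not the authors' implementation. The speaker-specific output weights are modeled as vec(W_s) = vec(W_ubm) + T w_s, with w_s playing the role of the i-vector.

```python
import numpy as np

# Hedged sketch of subspace-constrained AANN adaptation (illustrative only).
rng = np.random.default_rng(0)

D, H, R = 39, 13, 10  # feature dim, compression-layer size, subspace rank (assumed values)

# UBM-AANN parameters (would be trained on held-out speakers; random here)
W1 = rng.standard_normal((H, D)) * 0.1      # input -> hidden compression layer
b1 = np.zeros(H)
W_ubm = rng.standard_normal((D, H)) * 0.1   # hidden -> output (the layer that is adapted)
b2 = np.zeros(D)
T = rng.standard_normal((D * H, R)) * 0.01  # common low-rank subspace (learned on dev data)

def hidden(X):
    """Nonlinear compression-layer activations for frames X (N x D)."""
    return np.tanh(X @ W1.T + b1)

def estimate_ivector(X):
    """Least-squares estimate of the speaker coordinates w in the subspace.

    With Hx = hidden(X), the reconstruction is
        X_hat = Hx @ (W_ubm + unvec(T @ w)).T + b2,
    which is linear in w, so w has a closed-form least-squares solution.
    """
    Hx = hidden(X)                         # N x H
    resid = X - b2 - Hx @ W_ubm.T          # N x D residual under the UBM output weights
    # Column r of the design matrix: contribution of subspace direction T[:, r] to the output
    A = np.stack([(Hx @ T[:, r].reshape(D, H).T).ravel() for r in range(R)], axis=1)
    w, *_ = np.linalg.lstsq(A, resid.ravel(), rcond=None)
    return w

# Toy usage: "speaker data" is random frames here
X_speaker = rng.standard_normal((200, D))
w_s = estimate_ivector(X_speaker)
print("i-vector (subspace coordinates):", np.round(w_s, 3))
```

In this sketch the i-vector is the only speaker-specific quantity estimated, mirroring the abstract's point that fixing the subspace and adapting just its coordinates avoids overfitting when speaker data is scarce.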
