
Reducing computation in speaker recognition systems using a tree-structured universal background model.


Abstract

State-of-the-art speaker recognition systems use speaker models derived from an adapted universal background model (UBM) in the form of a Gaussian mixture model (GMM). This is true for GMM supervector systems, joint factor analysis systems, and, most recently, i-vector systems. In all of these systems, the calculation of posterior probabilities and of the sufficient statistics for the weight, mean, and covariance parameters represents a computational bottleneck in both enrollment and testing. In this dissertation, we have developed a method that uses a lower-resolution GMM hash, built from clusters of GMM-UBM mixture component densities, to reduce the computational load. In the adaptation step, we score the feature vectors against the hash, then calculate the a posteriori probabilities and update the statistics exclusively for mixture components belonging to the appropriate clusters.

Each cluster is a grouping of multivariate normal distributions and is modeled by a single multivariate distribution. As such, the set of multivariate normal distributions representing the different clusters also forms a GMM. This GMM is referred to as a hash GMM, which can be considered a lower-resolution representation of the GMM-UBM. The mapping that associates the components of the hash GMM with components of the original GMM-UBM is referred to as a shortlist.

This research investigates various methods of clustering the components of the GMM-UBM and forming hash GMMs. Of the five methods presented, one method, Gaussian mixture reduction, outperforms the others in terms of reducing computation while preserving recognition accuracy. This method iteratively reduces the size of a GMM by successively merging pairs of component densities using a metric based on the Kullback-Leibler divergence.

Evaluated with a Gaussian mean supervector SVM system and a single-layer hash, our research achieves a factor of 2.77 reduction in a posteriori probability calculations with no loss in recognition accuracy when using a 250-component GMM-UBM. When clustering was implemented with a 1024-component UBM, we achieved a 5x reduction in computation with no loss in accuracy and a 10x reduction with less than 2.4% relative degradation in EER.

This hash system is extended in this research by employing a tree-structured GMM-UBM, which uses Runnalls' Gaussian mixture reduction technique at multiple hierarchical layers, in order to further reduce the number of these probabilistic alignment calculations. With this tree-structured hash, we can reduce this computation by a factor of 14x while incurring less than 5% relative degradation of equal error rate (EER) with a state-of-the-art i-vector system.
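The shortlist idea described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes diagonal covariances, a toy four-component UBM, and a hypothetical function name `shortlist_posteriors` with a hypothetical `top_clusters` parameter. A frame is first scored against the small hash GMM, and posteriors are then computed only for the UBM components that the shortlist maps to the best-scoring cluster.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def shortlist_posteriors(x, hash_gmm, ubm, shortlist, top_clusters=1):
    """Approximate UBM posteriors for one frame x via a hash GMM.

    hash_gmm, ubm: lists of (weight, mean, var) tuples (illustrative layout).
    shortlist[k]: indices of the UBM components that belong to cluster k.
    Returns {ubm_index: posterior}; all other components are treated as zero.
    """
    # Step 1: score the frame against the low-resolution hash GMM.
    hash_scores = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in hash_gmm]
    best = sorted(range(len(hash_gmm)), key=lambda k: hash_scores[k],
                  reverse=True)[:top_clusters]
    # Step 2: evaluate only the UBM components listed for those clusters.
    candidates = sorted({i for k in best for i in shortlist[k]})
    log_joint = {
        i: math.log(ubm[i][0]) + log_gauss_diag(x, ubm[i][1], ubm[i][2])
        for i in candidates
    }
    # Step 3: normalise over the candidates with a stable log-sum-exp.
    peak = max(log_joint.values())
    total = sum(math.exp(v - peak) for v in log_joint.values())
    return {i: math.exp(v - peak) / total for i, v in log_joint.items()}

# Toy data (made up for illustration): two well-separated pairs of components.
ubm = [
    (0.25, [0.0, 0.0], [1.0, 1.0]),
    (0.25, [1.0, 0.0], [1.0, 1.0]),
    (0.25, [10.0, 10.0], [1.0, 1.0]),
    (0.25, [11.0, 10.0], [1.0, 1.0]),
]
hash_gmm = [
    (0.5, [0.5, 0.0], [1.25, 1.0]),
    (0.5, [10.5, 10.0], [1.25, 1.0]),
]
shortlist = [[0, 1], [2, 3]]

post = shortlist_posteriors([0.2, 0.1], hash_gmm, ubm, shortlist)
```

In this toy case only two of the four UBM components are evaluated for the frame, at the cost of two extra hash evaluations. The saving only becomes worthwhile when the UBM is much larger than the hash, as in the 1024-component case the abstract reports.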
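The pairwise merging described in the abstract can be sketched as a greedy loop. This is an illustration under stated assumptions, not the author's code: covariances are taken to be diagonal (so the merged covariance keeps only its diagonal), the function names are hypothetical, and the cost is the commonly cited form of Runnalls' Kullback-Leibler-based bound, B(i, j) = ½[(w_i + w_j) log det P_ij − w_i log det P_i − w_j log det P_j], where P_ij is the moment-preserving merged covariance.

```python
import math

def merge(c1, c2):
    """Moment-preserving merge of two (weight, mean, var) diagonal Gaussians."""
    (w1, m1, v1), (w2, m2, v2) = c1, c2
    w = w1 + w2
    a1, a2 = w1 / w, w2 / w
    mean = [a1 * p + a2 * q for p, q in zip(m1, m2)]
    # Weighted variances plus the spread between the two means (diagonal only).
    var = [a1 * s + a2 * t + a1 * a2 * (p - q) ** 2
           for s, t, p, q in zip(v1, v2, m1, m2)]
    return (w, mean, var)

def logdet_diag(var):
    """Log determinant of a diagonal covariance given its variances."""
    return sum(math.log(s) for s in var)

def merge_cost(c1, c2):
    """Runnalls-style bound on the KL discrepancy caused by merging c1 and c2."""
    w, _, var = merge(c1, c2)
    return 0.5 * (w * logdet_diag(var)
                  - c1[0] * logdet_diag(c1[2])
                  - c2[0] * logdet_diag(c2[2]))

def reduce_gmm(components, target_size):
    """Greedily merge the cheapest pair until target_size components remain.

    Returns (reduced, shortlist); shortlist[k] lists the original component
    indices that were merged into reduced component k.
    """
    comps = list(components)
    members = [[i] for i in range(len(comps))]
    while len(comps) > max(1, target_size):
        _, i, j = min(
            (merge_cost(comps[a], comps[b]), a, b)
            for a in range(len(comps))
            for b in range(a + 1, len(comps))
        )
        comps[i] = merge(comps[i], comps[j])
        members[i] = sorted(members[i] + members[j])
        del comps[j], members[j]  # j > i, so index i is unaffected
    return comps, members

# Toy UBM (made up for illustration): two tight pairs far apart.
ubm = [
    (0.25, [0.0, 0.0], [1.0, 1.0]),
    (0.25, [1.0, 0.0], [1.0, 1.0]),
    (0.25, [10.0, 10.0], [1.0, 1.0]),
    (0.25, [11.0, 10.0], [1.0, 1.0]),
]
reduced, shortlist = reduce_gmm(ubm, 2)
```

The reduced components play the role of a hash GMM, and the membership lists returned alongside them play the role of the shortlist. Applying the same reduction again to the reduced model would give a further, coarser layer, which is one plausible way to picture the tree-structured extension the abstract mentions.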

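Bibliographic record

  • Author

    McClanahan, Richard Daniel

  • Author affiliation

    New Mexico State University

  • Degree-granting institution: New Mexico State University
  • Discipline: Electrical engineering
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 174 p.
  • Total pages: 174
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification
  • Keywords

  • Date added to database: 2022-08-17 11:53:30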
