Reducing computation in speaker recognition systems using a tree-structured universal background model.

Abstract

State-of-the-art speaker recognition systems use speaker models derived from an adapted universal background model (UBM) in the form of a Gaussian mixture model (GMM). This is true for GMM supervector systems, joint factor analysis systems, and, most recently, i-vector systems. In all of these systems, calculating the posterior probabilities and the sufficient statistics for the weight, mean, and covariance parameters represents a computational bottleneck in both enrollment and testing. In this dissertation, we have developed a method that uses a lower-resolution GMM hash, built from clusters of GMM-UBM mixture component densities, to reduce the required computational load. In the adaptation step, we score the feature vectors against the hash, then calculate the a posteriori probabilities and update the statistics exclusively for mixture components belonging to the appropriate clusters.

Each cluster is a grouping of multivariate normal distributions and is modeled by a single multivariate distribution. As such, the set of multivariate normal distributions representing the different clusters also forms a GMM. This GMM is referred to as a hash GMM and can be considered a lower-resolution representation of the GMM-UBM. The mapping that associates the components of the hash GMM with components of the original GMM-UBM is referred to as a shortlist.

This research investigates various methods of clustering the components of the GMM-UBM and forming hash GMMs. Of the five methods presented, one method, Gaussian mixture reduction, outperforms the others in reducing computation while preserving recognition accuracy.
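The hash-and-shortlist lookup described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes diagonal covariances, takes "appropriate clusters" to mean the `top_clusters` best-scoring hash components, and every name in it (`shortlist_posteriors`, `log_gauss_diag`, the toy models) is hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def shortlist_posteriors(x, ubm, hash_gmm, shortlist, top_clusters=1):
    """Approximate UBM posteriors for x using only shortlisted components.

    ubm, hash_gmm: lists of (weight, mean, var) tuples.
    shortlist[k]: indices of the UBM components grouped under hash component k.
    """
    # Step 1: score x against the small hash GMM.
    hash_scores = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in hash_gmm]
    best = sorted(range(len(hash_gmm)), key=lambda k: hash_scores[k],
                  reverse=True)[:top_clusters]
    # Step 2: evaluate only the UBM components in the selected clusters.
    candidates = sorted({c for k in best for c in shortlist[k]})
    logs = {c: math.log(ubm[c][0]) + log_gauss_diag(x, ubm[c][1], ubm[c][2])
            for c in candidates}
    # Step 3: normalize over the evaluated components (log-sum-exp for stability).
    peak = max(logs.values())
    total = sum(math.exp(v - peak) for v in logs.values())
    return {c: math.exp(v - peak) / total for c, v in logs.items()}

# Toy 1-D example: a 4-component UBM hashed into 2 clusters.
ubm = [(0.25, [-5.0], [1.0]), (0.25, [-4.0], [1.0]),
       (0.25, [4.0], [1.0]), (0.25, [5.0], [1.0])]
hash_gmm = [(0.5, [-4.5], [1.25]), (0.5, [4.5], [1.25])]
shortlist = [[0, 1], [2, 3]]
post = shortlist_posteriors([4.2], ubm, hash_gmm, shortlist)
```

In this toy case only two of the four UBM components are evaluated for the frame, at the cost of two extra hash evaluations. The saving grows with the ratio of UBM size to hash size.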
This method of Gaussian mixture reduction iteratively reduces the size of a GMM by successively merging pairs of component densities, using a metric based on the Kullback-Leibler divergence.

Evaluated with a Gaussian mean supervector SVM system and a single-layer hash, our research achieves a 2.77x reduction in a posteriori probability calculations with no loss in recognition accuracy when using a 250-component GMM-UBM. When clustering was implemented with a 1024-component UBM, we achieved a 5x reduction in computation with no loss in accuracy, and a 10x reduction with less than 2.4% relative degradation in equal error rate (EER).

This research extends the hash system by employing a tree-structured GMM-UBM, which applies Runnalls' Gaussian mixture reduction technique at multiple hierarchical layers to further reduce the number of these probabilistic alignment calculations. With this tree-structured hash, we can reduce this computation by a factor of 14 while incurring less than 5% relative degradation in EER with a state-of-the-art i-vector system.
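As a rough illustration of the pairwise merging step, the sketch below greedily merges univariate Gaussians using the moment-preserving merge and the Kullback-Leibler-based cost bound commonly attributed to Runnalls. The restriction to one dimension, the greedy exhaustive pair search, and all function names are assumptions made for brevity, not details taken from the dissertation.

```python
import math

def merge(a, b):
    """Moment-preserving merge of two weighted 1-D Gaussians (w, mean, var)."""
    (wa, ma, va), (wb, mb, vb) = a, b
    w = wa + wb
    mean = (wa * ma + wb * mb) / w
    var = (wa * va + wb * vb) / w + wa * wb * (ma - mb) ** 2 / w ** 2
    return (w, mean, var)

def merge_cost(a, b):
    """KL-based upper bound on the discrimination lost by merging a and b."""
    w, _, var = merge(a, b)
    return 0.5 * (w * math.log(var)
                  - a[0] * math.log(a[2])
                  - b[0] * math.log(b[2]))

def reduce_mixture(components, target_size):
    """Greedily merge the cheapest pair until target_size components remain.

    Returns the reduced mixture and, for each reduced component, the list of
    original component indices it absorbed (usable as a shortlist).
    """
    comps = list(components)
    members = [[i] for i in range(len(comps))]
    while len(comps) > target_size:
        # Find the pair (i, j), i < j, with the smallest merge cost.
        _, i, j = min(
            (merge_cost(comps[i], comps[j]), i, j)
            for i in range(len(comps))
            for j in range(i + 1, len(comps))
        )
        comps[i] = merge(comps[i], comps[j])
        members[i] = members[i] + members[j]
        del comps[j], members[j]  # j > i, so index i is unaffected
    return comps, members

# Toy example: reduce a 4-component mixture to a 2-component hash.
ubm_1d = [(0.25, -5.0, 1.0), (0.25, -4.0, 1.0),
          (0.25, 4.0, 1.0), (0.25, 5.0, 1.0)]
hash_1d, shortlist_1d = reduce_mixture(ubm_1d, 2)
```

Applying the same reduction again to the resulting hash would give a coarser layer above it, which is one plausible way to picture the multi-layer tree the abstract describes.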

Bibliographic record

Author: McClanahan, Richard Daniel
Author affiliation: New Mexico State University
Degree-granting institution: New Mexico State University
Discipline: Electrical engineering
Degree: Ph.D.
Year: 2014
Pages: 174
Original format: PDF
Language: English
Date added: 2022-08-17 11:53:30
